Crypto++ 8.2
Free C&
integer.cpp
1// integer.cpp - originally written and placed in the public domain by Wei Dai
2// contains public domain code contributed by Alister Lee and Leonard Janke
3
4// Notes by JW: The Integer class needs to do two things. First, it needs
5// to set function pointers on some platforms, like X86 and X64. The
6// function pointers select a fast multiply and addition based on the cpu.
7// Second, it wants to create Integer::Zero(), Integer::One() and
8// Integer::Two().
9// The function pointers are initialized in the InitializeInteger class by
10// calling SetFunctionPointers(). The call to SetFunctionPointers() is
11// guarded to run once using a double-checked pattern. We don't use C++
12// std::call_once due to bad interactions between libstdc++, glibc and
13// pthreads. The bad interactions were causing crashes for us on platforms
14// like Sparc and PowerPC. Since we are only setting function pointers we
15// don't have to worry about leaking memory. The worst case seems to be the
16// pointers gets written twice until the init flag is set and visible to
17// all threads.
18// For Integer::Zero(), Integer::One() and Integer::Two(), we use one of three
19// strategies. First, if initialization priorities are available then we use
20// them. Initialization priorities are init_priority() on Linux and init_seg()
21// on Windows. AIX, OS X and several other platforms lack them. Initialization
22// priorities are platform specific but they are also the most trouble free
23// with determisitic destruction.
24// Second, if C++11 dynamic initialization is available, then we use it. After
25// the std::call_once fiasco we dropped the priority dynamic initialization
26// to avoid unknown troubles platforms that are tested less frequently. In
27// addition Microsoft platforms mostly do not provide dynamic initialization.
28// The MSDN docs claim they do but they don't in practice because we need
29// Visual Studio 2017 and Windows 10 or above.
30// Third, we fall back to Wei's original code of a Singleton. Wei's original
31// code was much simpler. It simply used the Singleton pattern, but it always
32// produced memory findings on some platforms. The Singleton generates memory
33// findings because it uses a Create On First Use pattern (a dumb Nifty
34// Counter) and the compiler had to be smart enough to fold them to return
35// the same object. Unix and Linux compilers do a good job of folding objects,
36// but Microsoft compilers do a rather poor job for some versions of the
37// compiler.
38// Another problem with the Singleton is resource destruction requires running
39// resource acquisition in reverse. For resources provided through the
40// Singletons, there is no way to express the dependency order to safely
41// destroy resources. (That's one of the problems C++11 dynamic
42// intitialization with concurrent execution is supposed to solve).
43// The final problem with Singletons is resource/memory exhaustion in languages
44// like Java and .Net. Java and .Net load and unload a native DLL hundreds or
45// thousands of times during the life of a program. Each load produces a
46// memory leak and they can add up quickly. If they library is being used in
47// Java or .Net then Singleton must be avoided at all costs.
48
49#include "pch.h"
50#include "config.h"
51
52#ifndef CRYPTOPP_IMPORTS
53
54#include "integer.h"
55#include "secblock.h"
56#include "modarith.h"
57#include "nbtheory.h"
58#include "smartptr.h"
59#include "algparam.h"
60#include "filters.h"
61#include "stdcpp.h"
62#include "asn.h"
63#include "oids.h"
64#include "words.h"
65#include "pubkey.h" // for P1363_KDF2
66#include "sha.h"
67#include "cpu.h"
68#include "misc.h"
69
70#include <iostream>
71
72#if (_MSC_VER >= 1400) && !defined(_M_ARM)
73 #include <intrin.h>
74#endif
75
76#ifdef __DECCXX
77 #include <c_asm.h>
78#endif
79
80// "Error: The operand ___LKDB cannot be assigned to", http://github.com/weidai11/cryptopp/issues/188
81#if (__SUNPRO_CC >= 0x5130)
82# define MAYBE_CONST
83# define MAYBE_UNCONST_CAST(x) const_cast<word*>(x)
84#else
85# define MAYBE_CONST const
86# define MAYBE_UNCONST_CAST(x) x
87#endif
88
89// "Inline assembly operands don't work with .intel_syntax",
90// http://llvm.org/bugs/show_bug.cgi?id=24232
91#if CRYPTOPP_BOOL_X32 || defined(CRYPTOPP_DISABLE_MIXED_ASM)
92# undef CRYPTOPP_X86_ASM_AVAILABLE
93# undef CRYPTOPP_X32_ASM_AVAILABLE
94# undef CRYPTOPP_X64_ASM_AVAILABLE
95# undef CRYPTOPP_SSE2_ASM_AVAILABLE
96# undef CRYPTOPP_SSSE3_ASM_AVAILABLE
97#else
98# define CRYPTOPP_INTEGER_SSE2 (CRYPTOPP_SSE2_ASM_AVAILABLE && (CRYPTOPP_BOOL_X86))
99#endif
100
101// ***************** C++ Static Initialization ********************
102
103NAMESPACE_BEGIN(CryptoPP)
104
105// Function body near the middle of the file
106static void SetFunctionPointers();
107
108// Use a double-checked pattern. We are not leaking anything so it
109// does not matter if a pointer is written twice during a race.
110// Avoid std::call_once due to too many problems on platforms like
111// Solaris and Sparc. Also see
112// http://gcc.gnu.org/bugzilla/show_bug.cgi?id=66146 and
113// http://github.com/weidai11/cryptopp/issues/707.
114InitializeInteger::InitializeInteger()
115{
116 static bool s_flag;
118 if (s_flag == false)
119 {
120 SetFunctionPointers();
121 s_flag = true;
123 }
124}
125
126template <long i>
128{
129 Integer * operator()() const
130 {
131 return new Integer(i);
132 }
133};
134
135// ***************** Library code ********************
136
137inline static int Compare(const word *A, const word *B, size_t N)
138{
139 while (N--)
140 if (A[N] > B[N])
141 return 1;
142 else if (A[N] < B[N])
143 return -1;
144
145 return 0;
146}
147
148inline static int Increment(word *A, size_t N, word B=1)
149{
151 word t = A[0];
152 A[0] = t+B;
153 if (A[0] >= t)
154 return 0;
155 for (unsigned i=1; i<N; i++)
156 if (++A[i])
157 return 0;
158 return 1;
159}
160
161inline static int Decrement(word *A, size_t N, word B=1)
162{
164 word t = A[0];
165 A[0] = t-B;
166 if (A[0] <= t)
167 return 0;
168 for (unsigned i=1; i<N; i++)
169 if (A[i]--)
170 return 0;
171 return 1;
172}
173
174static void TwosComplement(word *A, size_t N)
175{
176 Decrement(A, N);
177 for (unsigned i=0; i<N; i++)
178 A[i] = ~A[i];
179}
180
181static word AtomicInverseModPower2(word A)
182{
183 CRYPTOPP_ASSERT(A%2==1);
184
185 word R=A%8;
186
187 for (unsigned i=3; i<WORD_BITS; i*=2)
188 R = R*(2-R*A);
189
190 CRYPTOPP_ASSERT(R*A==1);
191 return R;
192}
193
194// ********************************************************
195
196#if !defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE) || (defined(__x86_64__) && defined(CRYPTOPP_WORD128_AVAILABLE))
197 #define TWO_64_BIT_WORDS 1
198 #define Declare2Words(x) word x##0, x##1;
199 #define AssignWord(a, b) a##0 = b; a##1 = 0;
200 #define Add2WordsBy1(a, b, c) a##0 = b##0 + c; a##1 = b##1 + (a##0 < c);
201 #define LowWord(a) a##0
202 #define HighWord(a) a##1
203 #ifdef _MSC_VER
204 #define MultiplyWordsLoHi(p0, p1, a, b) p0 = _umul128(a, b, &p1);
205 #ifndef __INTEL_COMPILER
206 #define Double3Words(c, d) d##1 = __shiftleft128(d##0, d##1, 1); d##0 = __shiftleft128(c, d##0, 1); c *= 2;
207 #endif
208 #elif defined(__DECCXX)
209 #define MultiplyWordsLoHi(p0, p1, a, b) p0 = a*b; p1 = asm("umulh %a0, %a1, %v0", a, b);
210 #elif defined(__x86_64__)
211 #if defined(__SUNPRO_CC) && __SUNPRO_CC < 0x5100
212 // Sun Studio's gcc-style inline assembly is heavily bugged as of version 5.9 Patch 124864-09 2008/12/16, but this one works
213 #define MultiplyWordsLoHi(p0, p1, a, b) asm ("mulq %3" : "=a"(p0), "=d"(p1) : "a"(a), "r"(b) : "cc");
214 #else
215 #define MultiplyWordsLoHi(p0, p1, a, b) asm ("mulq %3" : "=a"(p0), "=d"(p1) : "a"(a), "g"(b) : "cc");
216 #define MulAcc(c, d, a, b) asm ("mulq %6; addq %3, %0; adcq %4, %1; adcq $0, %2;" : "+r"(c), "+r"(d##0), "+r"(d##1), "=a"(p0), "=d"(p1) : "a"(a), "g"(b) : "cc");
217 #define Double3Words(c, d) asm ("addq %0, %0; adcq %1, %1; adcq %2, %2;" : "+r"(c), "+r"(d##0), "+r"(d##1) : : "cc");
218 #define Acc2WordsBy1(a, b) asm ("addq %2, %0; adcq $0, %1;" : "+r"(a##0), "+r"(a##1) : "r"(b) : "cc");
219 #define Acc2WordsBy2(a, b) asm ("addq %2, %0; adcq %3, %1;" : "+r"(a##0), "+r"(a##1) : "r"(b##0), "r"(b##1) : "cc");
220 #define Acc3WordsBy2(c, d, e) asm ("addq %5, %0; adcq %6, %1; adcq $0, %2;" : "+r"(c), "=r"(e##0), "=r"(e##1) : "1"(d##0), "2"(d##1), "r"(e##0), "r"(e##1) : "cc");
221 #endif
222 #endif
223 #define MultiplyWords(p, a, b) MultiplyWordsLoHi(p##0, p##1, a, b)
224 #ifndef Double3Words
225 #define Double3Words(c, d) d##1 = 2*d##1 + (d##0>>(WORD_BITS-1)); d##0 = 2*d##0 + (c>>(WORD_BITS-1)); c *= 2;
226 #endif
227 #ifndef Acc2WordsBy2
228 #define Acc2WordsBy2(a, b) a##0 += b##0; a##1 += a##0 < b##0; a##1 += b##1;
229 #endif
230 #define AddWithCarry(u, a, b) {word t = a+b; u##0 = t + u##1; u##1 = (t<a) + (u##0<t);}
231 #define SubtractWithBorrow(u, a, b) {word t = a-b; u##0 = t - u##1; u##1 = (t>a) + (u##0>t);}
232 #define GetCarry(u) u##1
233 #define GetBorrow(u) u##1
234#else
235 #define Declare2Words(x) dword x;
236 #if _MSC_VER >= 1400 && !defined(__INTEL_COMPILER) && (defined(_M_IX86) || defined(_M_X64) || defined(_M_IA64))
237 #define MultiplyWords(p, a, b) p = __emulu(a, b);
238 #else
239 #define MultiplyWords(p, a, b) p = (dword)a*b;
240 #endif
241 #define AssignWord(a, b) a = b;
242 #define Add2WordsBy1(a, b, c) a = b + c;
243 #define Acc2WordsBy2(a, b) a += b;
244 #define LowWord(a) word(a)
245 #define HighWord(a) word(a>>WORD_BITS)
246 #define Double3Words(c, d) d = 2*d + (c>>(WORD_BITS-1)); c *= 2;
247 #define AddWithCarry(u, a, b) u = dword(a) + b + GetCarry(u);
248 #define SubtractWithBorrow(u, a, b) u = dword(a) - b - GetBorrow(u);
249 #define GetCarry(u) HighWord(u)
250 #define GetBorrow(u) word(u>>(WORD_BITS*2-1))
251#endif
252#ifndef MulAcc
253 #define MulAcc(c, d, a, b) MultiplyWords(p, a, b); Acc2WordsBy1(p, c); c = LowWord(p); Acc2WordsBy1(d, HighWord(p));
254#endif
255#ifndef Acc2WordsBy1
256 #define Acc2WordsBy1(a, b) Add2WordsBy1(a, a, b)
257#endif
258#ifndef Acc3WordsBy2
259 #define Acc3WordsBy2(c, d, e) Acc2WordsBy1(e, c); c = LowWord(e); Add2WordsBy1(e, d, HighWord(e));
260#endif
261
262class DWord
263{
264public:
265#if defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE)
266 DWord() {std::memset(&m_whole, 0x00, sizeof(m_whole));}
267#else
268 DWord() {std::memset(&m_halfs, 0x00, sizeof(m_halfs));}
269#endif
270
271#ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
272 explicit DWord(word low) : m_whole(low) { }
273#else
274 explicit DWord(word low)
275 {
276 m_halfs.high = 0;
277 m_halfs.low = low;
278 }
279#endif
280
281#if defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE)
282 DWord(word low, word high) : m_whole()
283#else
284 DWord(word low, word high) : m_halfs()
285#endif
286 {
287#if defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE)
288# if (CRYPTOPP_LITTLE_ENDIAN)
289 const word t[2] = {low,high};
290 memcpy(&m_whole, t, sizeof(m_whole));
291# else
292 const word t[2] = {high,low};
293 memcpy(&m_whole, t, sizeof(m_whole));
294# endif
295#else
296 m_halfs.low = low;
297 m_halfs.high = high;
298#endif
299 }
300
301 static DWord Multiply(word a, word b)
302 {
303 DWord r;
304 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
305 r.m_whole = (dword)a * b;
306 #elif defined(MultiplyWordsLoHi)
307 MultiplyWordsLoHi(r.m_halfs.low, r.m_halfs.high, a, b);
308 #else
310 #endif
311 return r;
312 }
313
314 static DWord MultiplyAndAdd(word a, word b, word c)
315 {
316 DWord r = Multiply(a, b);
317 return r += c;
318 }
319
320 DWord & operator+=(word a)
321 {
322 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
323 m_whole = m_whole + a;
324 #else
325 m_halfs.low += a;
326 m_halfs.high += (m_halfs.low < a);
327 #endif
328 return *this;
329 }
330
331 DWord operator+(word a)
332 {
333 DWord r;
334 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
335 r.m_whole = m_whole + a;
336 #else
337 r.m_halfs.low = m_halfs.low + a;
338 r.m_halfs.high = m_halfs.high + (r.m_halfs.low < a);
339 #endif
340 return r;
341 }
342
343 DWord operator-(DWord a)
344 {
345 DWord r;
346 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
347 r.m_whole = m_whole - a.m_whole;
348 #else
349 r.m_halfs.low = m_halfs.low - a.m_halfs.low;
350 r.m_halfs.high = m_halfs.high - a.m_halfs.high - (r.m_halfs.low > m_halfs.low);
351 #endif
352 return r;
353 }
354
355 DWord operator-(word a)
356 {
357 DWord r;
358 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
359 r.m_whole = m_whole - a;
360 #else
361 r.m_halfs.low = m_halfs.low - a;
362 r.m_halfs.high = m_halfs.high - (r.m_halfs.low > m_halfs.low);
363 #endif
364 return r;
365 }
366
367 // returns quotient, which must fit in a word
368 word operator/(word divisor);
369
370 word operator%(word a);
371
372 bool operator!() const
373 {
374 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
375 return !m_whole;
376 #else
377 return !m_halfs.high && !m_halfs.low;
378 #endif
379 }
380
381 // TODO: When NATIVE_DWORD is in effect, we access high and low, which are inactive
382 // union members, and that's UB. Also see http://stackoverflow.com/q/11373203.
383 word GetLowHalf() const {return m_halfs.low;}
384 word GetHighHalf() const {return m_halfs.high;}
385 word GetHighHalfAsBorrow() const {return 0-m_halfs.high;}
386
387private:
388 // Issue 274, "Types cannot be declared in anonymous union",
389 // http://github.com/weidai11/cryptopp/issues/274
390 // Thanks to Martin Bonner at http://stackoverflow.com/a/39507183
391 struct half_words
392 {
393 #if (CRYPTOPP_LITTLE_ENDIAN)
394 word low;
395 word high;
396 #else
397 word high;
398 word low;
399 #endif
400 };
401 union
402 {
403 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
404 dword m_whole;
405 #endif
406 half_words m_halfs;
407 };
408};
409
410class Word
411{
412public:
413 Word() : m_whole(0) {}
414 Word(word value) : m_whole(value) {}
415 Word(hword low, hword high) : m_whole(low | (word(high) << (WORD_BITS/2))) {}
416
417 static Word Multiply(hword a, hword b)
418 {
419 Word r;
420 r.m_whole = (word)a * b;
421 return r;
422 }
423
424 Word operator-(Word a)
425 {
426 Word r;
427 r.m_whole = m_whole - a.m_whole;
428 return r;
429 }
430
431 Word operator-(hword a)
432 {
433 Word r;
434 r.m_whole = m_whole - a;
435 return r;
436 }
437
438 // returns quotient, which must fit in a word
439 hword operator/(hword divisor)
440 {
441 return hword(m_whole / divisor);
442 }
443
444 bool operator!() const
445 {
446 return !m_whole;
447 }
448
449 word GetWhole() const {return m_whole;}
450 hword GetLowHalf() const {return hword(m_whole);}
451 hword GetHighHalf() const {return hword(m_whole>>(WORD_BITS/2));}
452 hword GetHighHalfAsBorrow() const {return 0-hword(m_whole>>(WORD_BITS/2));}
453
454private:
455 word m_whole;
456};
457
458// do a 3 word by 2 word divide, returns quotient and leaves remainder in A
459template <class S, class D>
460S DivideThreeWordsByTwo(S *A, S B0, S B1, D *dummy=NULLPTR)
461{
462 CRYPTOPP_UNUSED(dummy);
463
464 // Assert {A[2],A[1]} < {B1,B0}, so quotient can fit in a S
465 CRYPTOPP_ASSERT(A[2] < B1 || (A[2]==B1 && A[1] < B0));
466
467 // estimate the quotient: do a 2 S by 1 S divide.
468 // Profiling tells us the original second case was dominant, so it was promoted to the first If statement.
469 // The code change occurred at Commit dc99266599a0e72d.
470
471 S Q; bool pre = (S(B1+1) == 0);
472 if (B1 > 0 && !pre)
473 Q = D(A[1], A[2]) / S(B1+1);
474 else if (pre)
475 Q = A[2];
476 else
477 Q = D(A[0], A[1]) / B0;
478
479 // now subtract Q*B from A
480 D p = D::Multiply(B0, Q);
481 D u = (D) A[0] - p.GetLowHalf();
482 A[0] = u.GetLowHalf();
483 u = (D) A[1] - p.GetHighHalf() - u.GetHighHalfAsBorrow() - D::Multiply(B1, Q);
484 A[1] = u.GetLowHalf();
485 A[2] += u.GetHighHalf();
486
487 // Q <= actual quotient, so fix it
488 while (A[2] || A[1] > B1 || (A[1]==B1 && A[0]>=B0))
489 {
490 u = (D) A[0] - B0;
491 A[0] = u.GetLowHalf();
492 u = (D) A[1] - B1 - u.GetHighHalfAsBorrow();
493 A[1] = u.GetLowHalf();
494 A[2] += u.GetHighHalf();
495 Q++;
496 CRYPTOPP_ASSERT(Q); // shouldn't overflow
497 }
498
499 return Q;
500}
501
502// do a 4 word by 2 word divide, returns 2 word quotient in Q0 and Q1
503template <class S, class D>
504inline D DivideFourWordsByTwo(S *T, const D &Al, const D &Ah, const D &B)
505{
506 // Profiling tells us the original second case was dominant, so it was
507 // promoted to the first If statement. The code change occurred at
508 // Commit dc99266599a0e72d.
509
510 if (!!B)
511 {
512 S Q[2];
513 T[0] = Al.GetLowHalf();
514 T[1] = Al.GetHighHalf();
515 T[2] = Ah.GetLowHalf();
516 T[3] = Ah.GetHighHalf();
517 Q[1] = DivideThreeWordsByTwo<S, D>(T+1, B.GetLowHalf(), B.GetHighHalf());
518 Q[0] = DivideThreeWordsByTwo<S, D>(T, B.GetLowHalf(), B.GetHighHalf());
519 return D(Q[0], Q[1]);
520 }
521 else // if divisor is 0, we assume divisor==2**(2*WORD_BITS)
522 {
523 return D(Ah.GetLowHalf(), Ah.GetHighHalf());
524 }
525}
526
527// returns quotient, which must fit in a word
528inline word DWord::operator/(word a)
529{
530 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
531 return word(m_whole / a);
532 #else
533 hword r[4];
534 return DivideFourWordsByTwo<hword, Word>(r, m_halfs.low, m_halfs.high, a).GetWhole();
535 #endif
536}
537
538inline word DWord::operator%(word a)
539{
540 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
541 return word(m_whole % a);
542 #else
543 if (a < (word(1) << (WORD_BITS/2)))
544 {
545 hword h = hword(a);
546 word r = m_halfs.high % h;
547 r = ((m_halfs.low >> (WORD_BITS/2)) + (r << (WORD_BITS/2))) % h;
548 return hword((hword(m_halfs.low) + (r << (WORD_BITS/2))) % h);
549 }
550 else
551 {
552 hword r[4];
553 DivideFourWordsByTwo<hword, Word>(r, m_halfs.low, m_halfs.high, a);
554 return Word(r[0], r[1]).GetWhole();
555 }
556 #endif
557}
558
559// ********************************************************
560
561// Use some tricks to share assembly code between MSVC, GCC, Clang and Sun CC.
562#if defined(__GNUC__)
563 #define AddPrologue \
564 int result; \
565 __asm__ __volatile__ \
566 ( \
567 INTEL_NOPREFIX
568 #define AddEpilogue \
569 ATT_PREFIX \
570 : "=a" (result)\
571 : "d" (C), "a" (A), "D" (B), "c" (N) \
572 : "%esi", "memory", "cc" \
573 );\
574 return result;
575 #define MulPrologue \
576 __asm__ __volatile__ \
577 ( \
578 INTEL_NOPREFIX \
579 AS1( push ebx) \
580 AS2( mov ebx, edx)
581 #define MulEpilogue \
582 AS1( pop ebx) \
583 ATT_PREFIX \
584 : \
585 : "d" (s_maskLow16), "c" (C), "a" (A), "D" (B) \
586 : "%esi", "memory", "cc" \
587 );
588 #define SquPrologue MulPrologue
589 #define SquEpilogue \
590 AS1( pop ebx) \
591 ATT_PREFIX \
592 : \
593 : "d" (s_maskLow16), "c" (C), "a" (A) \
594 : "%esi", "%edi", "memory", "cc" \
595 );
596 #define TopPrologue MulPrologue
597 #define TopEpilogue \
598 AS1( pop ebx) \
599 ATT_PREFIX \
600 : \
601 : "d" (s_maskLow16), "c" (C), "a" (A), "D" (B), "S" (L) \
602 : "memory", "cc" \
603 );
604#else
605 #define AddPrologue \
606 __asm push edi \
607 __asm push esi \
608 __asm mov eax, [esp+12] \
609 __asm mov edi, [esp+16]
610 #define AddEpilogue \
611 __asm pop esi \
612 __asm pop edi \
613 __asm ret 8
614 #define SaveEBX
615 #define RestoreEBX
616 #define SquPrologue \
617 AS2( mov eax, A) \
618 AS2( mov ecx, C) \
619 SaveEBX \
620 AS2( lea ebx, s_maskLow16)
621 #define MulPrologue \
622 AS2( mov eax, A) \
623 AS2( mov edi, B) \
624 AS2( mov ecx, C) \
625 SaveEBX \
626 AS2( lea ebx, s_maskLow16)
627 #define TopPrologue \
628 AS2( mov eax, A) \
629 AS2( mov edi, B) \
630 AS2( mov ecx, C) \
631 AS2( mov esi, L) \
632 SaveEBX \
633 AS2( lea ebx, s_maskLow16)
634 #define SquEpilogue RestoreEBX
635 #define MulEpilogue RestoreEBX
636 #define TopEpilogue RestoreEBX
637#endif
638
639#ifdef CRYPTOPP_X64_MASM_AVAILABLE
640extern "C" {
641int Baseline_Add(size_t N, word *C, const word *A, const word *B);
642int Baseline_Sub(size_t N, word *C, const word *A, const word *B);
643}
644#elif defined(CRYPTOPP_X64_ASM_AVAILABLE) && defined(__GNUC__) && defined(CRYPTOPP_WORD128_AVAILABLE)
645int Baseline_Add(size_t N, word *C, const word *A, const word *B)
646{
647 word result;
648 __asm__ __volatile__
649 (
650 INTEL_NOPREFIX
651 AS1( neg %1)
652 ASJ( jz, 1, f)
653 AS2( mov %0,[%3+8*%1])
654 AS2( add %0,[%4+8*%1])
655 AS2( mov [%2+8*%1],%0)
656 ASL(0)
657 AS2( mov %0,[%3+8*%1+8])
658 AS2( adc %0,[%4+8*%1+8])
659 AS2( mov [%2+8*%1+8],%0)
660 AS2( lea %1,[%1+2])
661 ASJ( jrcxz, 1, f)
662 AS2( mov %0,[%3+8*%1])
663 AS2( adc %0,[%4+8*%1])
664 AS2( mov [%2+8*%1],%0)
665 ASJ( jmp, 0, b)
666 ASL(1)
667 AS2( mov %0, 0)
668 AS2( adc %0, %0)
669 ATT_NOPREFIX
670 : "=&r" (result), "+c" (N)
671 : "r" (C+N), "r" (A+N), "r" (B+N)
672 : "memory", "cc"
673 );
674 return (int)result;
675}
676
677int Baseline_Sub(size_t N, word *C, const word *A, const word *B)
678{
679 word result;
680 __asm__ __volatile__
681 (
682 INTEL_NOPREFIX
683 AS1( neg %1)
684 ASJ( jz, 1, f)
685 AS2( mov %0,[%3+8*%1])
686 AS2( sub %0,[%4+8*%1])
687 AS2( mov [%2+8*%1],%0)
688 ASL(0)
689 AS2( mov %0,[%3+8*%1+8])
690 AS2( sbb %0,[%4+8*%1+8])
691 AS2( mov [%2+8*%1+8],%0)
692 AS2( lea %1,[%1+2])
693 ASJ( jrcxz, 1, f)
694 AS2( mov %0,[%3+8*%1])
695 AS2( sbb %0,[%4+8*%1])
696 AS2( mov [%2+8*%1],%0)
697 ASJ( jmp, 0, b)
698 ASL(1)
699 AS2( mov %0, 0)
700 AS2( adc %0, %0)
701 ATT_NOPREFIX
702 : "=&r" (result), "+c" (N)
703 : "r" (C+N), "r" (A+N), "r" (B+N)
704 : "memory", "cc"
705 );
706 return (int)result;
707}
708#elif defined(CRYPTOPP_X86_ASM_AVAILABLE) && CRYPTOPP_BOOL_X86
709CRYPTOPP_NAKED int CRYPTOPP_FASTCALL Baseline_Add(size_t N, word *C, const word *A, const word *B)
710{
711 AddPrologue
712
713 // now: eax = A, edi = B, edx = C, ecx = N
714 AS2( lea eax, [eax+4*ecx])
715 AS2( lea edi, [edi+4*ecx])
716 AS2( lea edx, [edx+4*ecx])
717
718 AS1( neg ecx) // ecx is negative index
719 AS2( test ecx, 2) // this clears carry flag
720 ASJ( jz, 0, f)
721 AS2( sub ecx, 2)
722 ASJ( jmp, 1, f)
723
724 ASL(0)
725 ASJ( jecxz, 2, f) // loop until ecx overflows and becomes zero
726 AS2( mov esi,[eax+4*ecx])
727 AS2( adc esi,[edi+4*ecx])
728 AS2( mov [edx+4*ecx],esi)
729 AS2( mov esi,[eax+4*ecx+4])
730 AS2( adc esi,[edi+4*ecx+4])
731 AS2( mov [edx+4*ecx+4],esi)
732 ASL(1)
733 AS2( mov esi,[eax+4*ecx+8])
734 AS2( adc esi,[edi+4*ecx+8])
735 AS2( mov [edx+4*ecx+8],esi)
736 AS2( mov esi,[eax+4*ecx+12])
737 AS2( adc esi,[edi+4*ecx+12])
738 AS2( mov [edx+4*ecx+12],esi)
739
740 AS2( lea ecx,[ecx+4]) // advance index, avoid inc which causes slowdown on Intel Core 2
741 ASJ( jmp, 0, b)
742
743 ASL(2)
744 AS2( mov eax, 0)
745 AS1( setc al) // store carry into eax (return result register)
746
747 AddEpilogue
748
749 // http://github.com/weidai11/cryptopp/issues/340
750 CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
751 CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
752}
753
754CRYPTOPP_NAKED int CRYPTOPP_FASTCALL Baseline_Sub(size_t N, word *C, const word *A, const word *B)
755{
756 AddPrologue
757
758 // now: eax = A, edi = B, edx = C, ecx = N
759 AS2( lea eax, [eax+4*ecx])
760 AS2( lea edi, [edi+4*ecx])
761 AS2( lea edx, [edx+4*ecx])
762
763 AS1( neg ecx) // ecx is negative index
764 AS2( test ecx, 2) // this clears carry flag
765 ASJ( jz, 0, f)
766 AS2( sub ecx, 2)
767 ASJ( jmp, 1, f)
768
769 ASL(0)
770 ASJ( jecxz, 2, f) // loop until ecx overflows and becomes zero
771 AS2( mov esi,[eax+4*ecx])
772 AS2( sbb esi,[edi+4*ecx])
773 AS2( mov [edx+4*ecx],esi)
774 AS2( mov esi,[eax+4*ecx+4])
775 AS2( sbb esi,[edi+4*ecx+4])
776 AS2( mov [edx+4*ecx+4],esi)
777 ASL(1)
778 AS2( mov esi,[eax+4*ecx+8])
779 AS2( sbb esi,[edi+4*ecx+8])
780 AS2( mov [edx+4*ecx+8],esi)
781 AS2( mov esi,[eax+4*ecx+12])
782 AS2( sbb esi,[edi+4*ecx+12])
783 AS2( mov [edx+4*ecx+12],esi)
784
785 AS2( lea ecx,[ecx+4]) // advance index, avoid inc which causes slowdown on Intel Core 2
786 ASJ( jmp, 0, b)
787
788 ASL(2)
789 AS2( mov eax, 0)
790 AS1( setc al) // store carry into eax (return result register)
791
792 AddEpilogue
793
794 // http://github.com/weidai11/cryptopp/issues/340
795 CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
796 CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
797}
798
799#if CRYPTOPP_INTEGER_SSE2
800CRYPTOPP_NAKED int CRYPTOPP_FASTCALL SSE2_Add(size_t N, word *C, const word *A, const word *B)
801{
802 AddPrologue
803
804 // now: eax = A, edi = B, edx = C, ecx = N
805 AS2( lea eax, [eax+4*ecx])
806 AS2( lea edi, [edi+4*ecx])
807 AS2( lea edx, [edx+4*ecx])
808
809 AS1( neg ecx) // ecx is negative index
810 AS2( pxor mm2, mm2)
811 ASJ( jz, 2, f)
812 AS2( test ecx, 2) // this clears carry flag
813 ASJ( jz, 0, f)
814 AS2( sub ecx, 2)
815 ASJ( jmp, 1, f)
816
817 ASL(0)
818 AS2( movd mm0, DWORD PTR [eax+4*ecx])
819 AS2( movd mm1, DWORD PTR [edi+4*ecx])
820 AS2( paddq mm0, mm1)
821 AS2( paddq mm2, mm0)
822 AS2( movd DWORD PTR [edx+4*ecx], mm2)
823 AS2( psrlq mm2, 32)
824
825 AS2( movd mm0, DWORD PTR [eax+4*ecx+4])
826 AS2( movd mm1, DWORD PTR [edi+4*ecx+4])
827 AS2( paddq mm0, mm1)
828 AS2( paddq mm2, mm0)
829 AS2( movd DWORD PTR [edx+4*ecx+4], mm2)
830 AS2( psrlq mm2, 32)
831
832 ASL(1)
833 AS2( movd mm0, DWORD PTR [eax+4*ecx+8])
834 AS2( movd mm1, DWORD PTR [edi+4*ecx+8])
835 AS2( paddq mm0, mm1)
836 AS2( paddq mm2, mm0)
837 AS2( movd DWORD PTR [edx+4*ecx+8], mm2)
838 AS2( psrlq mm2, 32)
839
840 AS2( movd mm0, DWORD PTR [eax+4*ecx+12])
841 AS2( movd mm1, DWORD PTR [edi+4*ecx+12])
842 AS2( paddq mm0, mm1)
843 AS2( paddq mm2, mm0)
844 AS2( movd DWORD PTR [edx+4*ecx+12], mm2)
845 AS2( psrlq mm2, 32)
846
847 AS2( add ecx, 4)
848 ASJ( jnz, 0, b)
849
850 ASL(2)
851 AS2( movd eax, mm2)
852 AS1( emms)
853
854 AddEpilogue
855
856 // http://github.com/weidai11/cryptopp/issues/340
857 CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
858 CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
859}
860CRYPTOPP_NAKED int CRYPTOPP_FASTCALL SSE2_Sub(size_t N, word *C, const word *A, const word *B)
861{
862 AddPrologue
863
864 // now: eax = A, edi = B, edx = C, ecx = N
865 AS2( lea eax, [eax+4*ecx])
866 AS2( lea edi, [edi+4*ecx])
867 AS2( lea edx, [edx+4*ecx])
868
869 AS1( neg ecx) // ecx is negative index
870 AS2( pxor mm2, mm2)
871 ASJ( jz, 2, f)
872 AS2( test ecx, 2) // this clears carry flag
873 ASJ( jz, 0, f)
874 AS2( sub ecx, 2)
875 ASJ( jmp, 1, f)
876
877 ASL(0)
878 AS2( movd mm0, DWORD PTR [eax+4*ecx])
879 AS2( movd mm1, DWORD PTR [edi+4*ecx])
880 AS2( psubq mm0, mm1)
881 AS2( psubq mm0, mm2)
882 AS2( movd DWORD PTR [edx+4*ecx], mm0)
883 AS2( psrlq mm0, 63)
884
885 AS2( movd mm2, DWORD PTR [eax+4*ecx+4])
886 AS2( movd mm1, DWORD PTR [edi+4*ecx+4])
887 AS2( psubq mm2, mm1)
888 AS2( psubq mm2, mm0)
889 AS2( movd DWORD PTR [edx+4*ecx+4], mm2)
890 AS2( psrlq mm2, 63)
891
892 ASL(1)
893 AS2( movd mm0, DWORD PTR [eax+4*ecx+8])
894 AS2( movd mm1, DWORD PTR [edi+4*ecx+8])
895 AS2( psubq mm0, mm1)
896 AS2( psubq mm0, mm2)
897 AS2( movd DWORD PTR [edx+4*ecx+8], mm0)
898 AS2( psrlq mm0, 63)
899
900 AS2( movd mm2, DWORD PTR [eax+4*ecx+12])
901 AS2( movd mm1, DWORD PTR [edi+4*ecx+12])
902 AS2( psubq mm2, mm1)
903 AS2( psubq mm2, mm0)
904 AS2( movd DWORD PTR [edx+4*ecx+12], mm2)
905 AS2( psrlq mm2, 63)
906
907 AS2( add ecx, 4)
908 ASJ( jnz, 0, b)
909
910 ASL(2)
911 AS2( movd eax, mm2)
912 AS1( emms)
913
914 AddEpilogue
915
916 // http://github.com/weidai11/cryptopp/issues/340
917 CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
918 CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
919}
920#endif // CRYPTOPP_INTEGER_SSE2
921#else // CRYPTOPP_SSE2_ASM_AVAILABLE
922int CRYPTOPP_FASTCALL Baseline_Add(size_t N, word *C, const word *A, const word *B)
923{
924 CRYPTOPP_ASSERT (N%2 == 0);
925
926 Declare2Words(u);
927 AssignWord(u, 0);
928 for (size_t i=0; i<N; i+=2)
929 {
930 AddWithCarry(u, A[i], B[i]);
931 C[i] = LowWord(u);
932 AddWithCarry(u, A[i+1], B[i+1]);
933 C[i+1] = LowWord(u);
934 }
935 return int(GetCarry(u));
936}
937
938int CRYPTOPP_FASTCALL Baseline_Sub(size_t N, word *C, const word *A, const word *B)
939{
940 CRYPTOPP_ASSERT (N%2 == 0);
941
942 Declare2Words(u);
943 AssignWord(u, 0);
944 for (size_t i=0; i<N; i+=2)
945 {
946 SubtractWithBorrow(u, A[i], B[i]);
947 C[i] = LowWord(u);
948 SubtractWithBorrow(u, A[i+1], B[i+1]);
949 C[i+1] = LowWord(u);
950 }
951 return int(GetBorrow(u));
952}
953#endif
954
955static word LinearMultiply(word *C, const word *AA, word B, size_t N)
956{
957 // http://github.com/weidai11/cryptopp/issues/188
958 MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
959
960 word carry=0;
961 for(unsigned i=0; i<N; i++)
962 {
963 Declare2Words(p);
964 MultiplyWords(p, A[i], B);
965 Acc2WordsBy1(p, carry);
966 C[i] = LowWord(p);
967 carry = HighWord(p);
968 }
969 return carry;
970}
971
972#ifndef CRYPTOPP_DOXYGEN_PROCESSING
973
974#define Mul_2 \
975 Mul_Begin(2) \
976 Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
977 Mul_End(1, 1)
978
979#define Mul_4 \
980 Mul_Begin(4) \
981 Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
982 Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
983 Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
984 Mul_SaveAcc(3, 1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) \
985 Mul_SaveAcc(4, 2, 3) Mul_Acc(3, 2) \
986 Mul_End(5, 3)
987
988#define Mul_8 \
989 Mul_Begin(8) \
990 Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
991 Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
992 Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
993 Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
994 Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
995 Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
996 Mul_SaveAcc(6, 0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
997 Mul_SaveAcc(7, 1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) \
998 Mul_SaveAcc(8, 2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) \
999 Mul_SaveAcc(9, 3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) \
1000 Mul_SaveAcc(10, 4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) \
1001 Mul_SaveAcc(11, 5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) \
1002 Mul_SaveAcc(12, 6, 7) Mul_Acc(7, 6) \
1003 Mul_End(13, 7)
1004
1005#define Mul_16 \
1006 Mul_Begin(16) \
1007 Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1008 Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
1009 Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1010 Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
1011 Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
1012 Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
1013 Mul_SaveAcc(6, 0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
1014 Mul_SaveAcc(7, 0, 8) Mul_Acc(1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) Mul_Acc(8, 0) \
1015 Mul_SaveAcc(8, 0, 9) Mul_Acc(1, 8) Mul_Acc(2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) Mul_Acc(8, 1) Mul_Acc(9, 0) \
1016 Mul_SaveAcc(9, 0, 10) Mul_Acc(1, 9) Mul_Acc(2, 8) Mul_Acc(3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) Mul_Acc(8, 2) Mul_Acc(9, 1) Mul_Acc(10, 0) \
1017 Mul_SaveAcc(10, 0, 11) Mul_Acc(1, 10) Mul_Acc(2, 9) Mul_Acc(3, 8) Mul_Acc(4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) Mul_Acc(8, 3) Mul_Acc(9, 2) Mul_Acc(10, 1) Mul_Acc(11, 0) \
1018 Mul_SaveAcc(11, 0, 12) Mul_Acc(1, 11) Mul_Acc(2, 10) Mul_Acc(3, 9) Mul_Acc(4, 8) Mul_Acc(5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) Mul_Acc(8, 4) Mul_Acc(9, 3) Mul_Acc(10, 2) Mul_Acc(11, 1) Mul_Acc(12, 0) \
1019 Mul_SaveAcc(12, 0, 13) Mul_Acc(1, 12) Mul_Acc(2, 11) Mul_Acc(3, 10) Mul_Acc(4, 9) Mul_Acc(5, 8) Mul_Acc(6, 7) Mul_Acc(7, 6) Mul_Acc(8, 5) Mul_Acc(9, 4) Mul_Acc(10, 3) Mul_Acc(11, 2) Mul_Acc(12, 1) Mul_Acc(13, 0) \
1020 Mul_SaveAcc(13, 0, 14) Mul_Acc(1, 13) Mul_Acc(2, 12) Mul_Acc(3, 11) Mul_Acc(4, 10) Mul_Acc(5, 9) Mul_Acc(6, 8) Mul_Acc(7, 7) Mul_Acc(8, 6) Mul_Acc(9, 5) Mul_Acc(10, 4) Mul_Acc(11, 3) Mul_Acc(12, 2) Mul_Acc(13, 1) Mul_Acc(14, 0) \
1021 Mul_SaveAcc(14, 0, 15) Mul_Acc(1, 14) Mul_Acc(2, 13) Mul_Acc(3, 12) Mul_Acc(4, 11) Mul_Acc(5, 10) Mul_Acc(6, 9) Mul_Acc(7, 8) Mul_Acc(8, 7) Mul_Acc(9, 6) Mul_Acc(10, 5) Mul_Acc(11, 4) Mul_Acc(12, 3) Mul_Acc(13, 2) Mul_Acc(14, 1) Mul_Acc(15, 0) \
1022 Mul_SaveAcc(15, 1, 15) Mul_Acc(2, 14) Mul_Acc(3, 13) Mul_Acc(4, 12) Mul_Acc(5, 11) Mul_Acc(6, 10) Mul_Acc(7, 9) Mul_Acc(8, 8) Mul_Acc(9, 7) Mul_Acc(10, 6) Mul_Acc(11, 5) Mul_Acc(12, 4) Mul_Acc(13, 3) Mul_Acc(14, 2) Mul_Acc(15, 1) \
1023 Mul_SaveAcc(16, 2, 15) Mul_Acc(3, 14) Mul_Acc(4, 13) Mul_Acc(5, 12) Mul_Acc(6, 11) Mul_Acc(7, 10) Mul_Acc(8, 9) Mul_Acc(9, 8) Mul_Acc(10, 7) Mul_Acc(11, 6) Mul_Acc(12, 5) Mul_Acc(13, 4) Mul_Acc(14, 3) Mul_Acc(15, 2) \
1024 Mul_SaveAcc(17, 3, 15) Mul_Acc(4, 14) Mul_Acc(5, 13) Mul_Acc(6, 12) Mul_Acc(7, 11) Mul_Acc(8, 10) Mul_Acc(9, 9) Mul_Acc(10, 8) Mul_Acc(11, 7) Mul_Acc(12, 6) Mul_Acc(13, 5) Mul_Acc(14, 4) Mul_Acc(15, 3) \
1025 Mul_SaveAcc(18, 4, 15) Mul_Acc(5, 14) Mul_Acc(6, 13) Mul_Acc(7, 12) Mul_Acc(8, 11) Mul_Acc(9, 10) Mul_Acc(10, 9) Mul_Acc(11, 8) Mul_Acc(12, 7) Mul_Acc(13, 6) Mul_Acc(14, 5) Mul_Acc(15, 4) \
1026 Mul_SaveAcc(19, 5, 15) Mul_Acc(6, 14) Mul_Acc(7, 13) Mul_Acc(8, 12) Mul_Acc(9, 11) Mul_Acc(10, 10) Mul_Acc(11, 9) Mul_Acc(12, 8) Mul_Acc(13, 7) Mul_Acc(14, 6) Mul_Acc(15, 5) \
1027 Mul_SaveAcc(20, 6, 15) Mul_Acc(7, 14) Mul_Acc(8, 13) Mul_Acc(9, 12) Mul_Acc(10, 11) Mul_Acc(11, 10) Mul_Acc(12, 9) Mul_Acc(13, 8) Mul_Acc(14, 7) Mul_Acc(15, 6) \
1028 Mul_SaveAcc(21, 7, 15) Mul_Acc(8, 14) Mul_Acc(9, 13) Mul_Acc(10, 12) Mul_Acc(11, 11) Mul_Acc(12, 10) Mul_Acc(13, 9) Mul_Acc(14, 8) Mul_Acc(15, 7) \
1029 Mul_SaveAcc(22, 8, 15) Mul_Acc(9, 14) Mul_Acc(10, 13) Mul_Acc(11, 12) Mul_Acc(12, 11) Mul_Acc(13, 10) Mul_Acc(14, 9) Mul_Acc(15, 8) \
1030 Mul_SaveAcc(23, 9, 15) Mul_Acc(10, 14) Mul_Acc(11, 13) Mul_Acc(12, 12) Mul_Acc(13, 11) Mul_Acc(14, 10) Mul_Acc(15, 9) \
1031 Mul_SaveAcc(24, 10, 15) Mul_Acc(11, 14) Mul_Acc(12, 13) Mul_Acc(13, 12) Mul_Acc(14, 11) Mul_Acc(15, 10) \
1032 Mul_SaveAcc(25, 11, 15) Mul_Acc(12, 14) Mul_Acc(13, 13) Mul_Acc(14, 12) Mul_Acc(15, 11) \
1033 Mul_SaveAcc(26, 12, 15) Mul_Acc(13, 14) Mul_Acc(14, 13) Mul_Acc(15, 12) \
1034 Mul_SaveAcc(27, 13, 15) Mul_Acc(14, 14) Mul_Acc(15, 13) \
1035 Mul_SaveAcc(28, 14, 15) Mul_Acc(15, 14) \
1036 Mul_End(29, 15)
1037
1038#define Squ_2 \
1039 Squ_Begin(2) \
1040 Squ_End(2)
1041
1042#define Squ_4 \
1043 Squ_Begin(4) \
1044 Squ_SaveAcc(1, 0, 2) Squ_Diag(1) \
1045 Squ_SaveAcc(2, 0, 3) Squ_Acc(1, 2) Squ_NonDiag \
1046 Squ_SaveAcc(3, 1, 3) Squ_Diag(2) \
1047 Squ_SaveAcc(4, 2, 3) Squ_NonDiag \
1048 Squ_End(4)
1049
1050#define Squ_8 \
1051 Squ_Begin(8) \
1052 Squ_SaveAcc(1, 0, 2) Squ_Diag(1) \
1053 Squ_SaveAcc(2, 0, 3) Squ_Acc(1, 2) Squ_NonDiag \
1054 Squ_SaveAcc(3, 0, 4) Squ_Acc(1, 3) Squ_Diag(2) \
1055 Squ_SaveAcc(4, 0, 5) Squ_Acc(1, 4) Squ_Acc(2, 3) Squ_NonDiag \
1056 Squ_SaveAcc(5, 0, 6) Squ_Acc(1, 5) Squ_Acc(2, 4) Squ_Diag(3) \
1057 Squ_SaveAcc(6, 0, 7) Squ_Acc(1, 6) Squ_Acc(2, 5) Squ_Acc(3, 4) Squ_NonDiag \
1058 Squ_SaveAcc(7, 1, 7) Squ_Acc(2, 6) Squ_Acc(3, 5) Squ_Diag(4) \
1059 Squ_SaveAcc(8, 2, 7) Squ_Acc(3, 6) Squ_Acc(4, 5) Squ_NonDiag \
1060 Squ_SaveAcc(9, 3, 7) Squ_Acc(4, 6) Squ_Diag(5) \
1061 Squ_SaveAcc(10, 4, 7) Squ_Acc(5, 6) Squ_NonDiag \
1062 Squ_SaveAcc(11, 5, 7) Squ_Diag(6) \
1063 Squ_SaveAcc(12, 6, 7) Squ_NonDiag \
1064 Squ_End(8)
1065
1066#define Squ_16 \
1067 Squ_Begin(16) \
1068 Squ_SaveAcc(1, 0, 2) Squ_Diag(1) \
1069 Squ_SaveAcc(2, 0, 3) Squ_Acc(1, 2) Squ_NonDiag \
1070 Squ_SaveAcc(3, 0, 4) Squ_Acc(1, 3) Squ_Diag(2) \
1071 Squ_SaveAcc(4, 0, 5) Squ_Acc(1, 4) Squ_Acc(2, 3) Squ_NonDiag \
1072 Squ_SaveAcc(5, 0, 6) Squ_Acc(1, 5) Squ_Acc(2, 4) Squ_Diag(3) \
1073 Squ_SaveAcc(6, 0, 7) Squ_Acc(1, 6) Squ_Acc(2, 5) Squ_Acc(3, 4) Squ_NonDiag \
1074 Squ_SaveAcc(7, 0, 8) Squ_Acc(1, 7) Squ_Acc(2, 6) Squ_Acc(3, 5) Squ_Diag(4) \
1075 Squ_SaveAcc(8, 0, 9) Squ_Acc(1, 8) Squ_Acc(2, 7) Squ_Acc(3, 6) Squ_Acc(4, 5) Squ_NonDiag \
1076 Squ_SaveAcc(9, 0, 10) Squ_Acc(1, 9) Squ_Acc(2, 8) Squ_Acc(3, 7) Squ_Acc(4, 6) Squ_Diag(5) \
1077 Squ_SaveAcc(10, 0, 11) Squ_Acc(1, 10) Squ_Acc(2, 9) Squ_Acc(3, 8) Squ_Acc(4, 7) Squ_Acc(5, 6) Squ_NonDiag \
1078 Squ_SaveAcc(11, 0, 12) Squ_Acc(1, 11) Squ_Acc(2, 10) Squ_Acc(3, 9) Squ_Acc(4, 8) Squ_Acc(5, 7) Squ_Diag(6) \
1079 Squ_SaveAcc(12, 0, 13) Squ_Acc(1, 12) Squ_Acc(2, 11) Squ_Acc(3, 10) Squ_Acc(4, 9) Squ_Acc(5, 8) Squ_Acc(6, 7) Squ_NonDiag \
1080 Squ_SaveAcc(13, 0, 14) Squ_Acc(1, 13) Squ_Acc(2, 12) Squ_Acc(3, 11) Squ_Acc(4, 10) Squ_Acc(5, 9) Squ_Acc(6, 8) Squ_Diag(7) \
1081 Squ_SaveAcc(14, 0, 15) Squ_Acc(1, 14) Squ_Acc(2, 13) Squ_Acc(3, 12) Squ_Acc(4, 11) Squ_Acc(5, 10) Squ_Acc(6, 9) Squ_Acc(7, 8) Squ_NonDiag \
1082 Squ_SaveAcc(15, 1, 15) Squ_Acc(2, 14) Squ_Acc(3, 13) Squ_Acc(4, 12) Squ_Acc(5, 11) Squ_Acc(6, 10) Squ_Acc(7, 9) Squ_Diag(8) \
1083 Squ_SaveAcc(16, 2, 15) Squ_Acc(3, 14) Squ_Acc(4, 13) Squ_Acc(5, 12) Squ_Acc(6, 11) Squ_Acc(7, 10) Squ_Acc(8, 9) Squ_NonDiag \
1084 Squ_SaveAcc(17, 3, 15) Squ_Acc(4, 14) Squ_Acc(5, 13) Squ_Acc(6, 12) Squ_Acc(7, 11) Squ_Acc(8, 10) Squ_Diag(9) \
1085 Squ_SaveAcc(18, 4, 15) Squ_Acc(5, 14) Squ_Acc(6, 13) Squ_Acc(7, 12) Squ_Acc(8, 11) Squ_Acc(9, 10) Squ_NonDiag \
1086 Squ_SaveAcc(19, 5, 15) Squ_Acc(6, 14) Squ_Acc(7, 13) Squ_Acc(8, 12) Squ_Acc(9, 11) Squ_Diag(10) \
1087 Squ_SaveAcc(20, 6, 15) Squ_Acc(7, 14) Squ_Acc(8, 13) Squ_Acc(9, 12) Squ_Acc(10, 11) Squ_NonDiag \
1088 Squ_SaveAcc(21, 7, 15) Squ_Acc(8, 14) Squ_Acc(9, 13) Squ_Acc(10, 12) Squ_Diag(11) \
1089 Squ_SaveAcc(22, 8, 15) Squ_Acc(9, 14) Squ_Acc(10, 13) Squ_Acc(11, 12) Squ_NonDiag \
1090 Squ_SaveAcc(23, 9, 15) Squ_Acc(10, 14) Squ_Acc(11, 13) Squ_Diag(12) \
1091 Squ_SaveAcc(24, 10, 15) Squ_Acc(11, 14) Squ_Acc(12, 13) Squ_NonDiag \
1092 Squ_SaveAcc(25, 11, 15) Squ_Acc(12, 14) Squ_Diag(13) \
1093 Squ_SaveAcc(26, 12, 15) Squ_Acc(13, 14) Squ_NonDiag \
1094 Squ_SaveAcc(27, 13, 15) Squ_Diag(14) \
1095 Squ_SaveAcc(28, 14, 15) Squ_NonDiag \
1096 Squ_End(16)
1097
1098#define Bot_2 \
1099 Mul_Begin(2) \
1100 Bot_SaveAcc(0, 0, 1) Bot_Acc(1, 0) \
1101 Bot_End(2)
1102
1103#define Bot_4 \
1104 Mul_Begin(4) \
1105 Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1106 Mul_SaveAcc(1, 2, 0) Mul_Acc(1, 1) Mul_Acc(0, 2) \
1107 Bot_SaveAcc(2, 0, 3) Bot_Acc(1, 2) Bot_Acc(2, 1) Bot_Acc(3, 0) \
1108 Bot_End(4)
1109
1110#define Bot_8 \
1111 Mul_Begin(8) \
1112 Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1113 Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
1114 Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1115 Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
1116 Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
1117 Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
1118 Bot_SaveAcc(6, 0, 7) Bot_Acc(1, 6) Bot_Acc(2, 5) Bot_Acc(3, 4) Bot_Acc(4, 3) Bot_Acc(5, 2) Bot_Acc(6, 1) Bot_Acc(7, 0) \
1119 Bot_End(8)
1120
1121#define Bot_16 \
1122 Mul_Begin(16) \
1123 Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1124 Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
1125 Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1126 Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
1127 Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
1128 Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
1129 Mul_SaveAcc(6, 0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
1130 Mul_SaveAcc(7, 0, 8) Mul_Acc(1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) Mul_Acc(8, 0) \
1131 Mul_SaveAcc(8, 0, 9) Mul_Acc(1, 8) Mul_Acc(2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) Mul_Acc(8, 1) Mul_Acc(9, 0) \
1132 Mul_SaveAcc(9, 0, 10) Mul_Acc(1, 9) Mul_Acc(2, 8) Mul_Acc(3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) Mul_Acc(8, 2) Mul_Acc(9, 1) Mul_Acc(10, 0) \
1133 Mul_SaveAcc(10, 0, 11) Mul_Acc(1, 10) Mul_Acc(2, 9) Mul_Acc(3, 8) Mul_Acc(4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) Mul_Acc(8, 3) Mul_Acc(9, 2) Mul_Acc(10, 1) Mul_Acc(11, 0) \
1134 Mul_SaveAcc(11, 0, 12) Mul_Acc(1, 11) Mul_Acc(2, 10) Mul_Acc(3, 9) Mul_Acc(4, 8) Mul_Acc(5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) Mul_Acc(8, 4) Mul_Acc(9, 3) Mul_Acc(10, 2) Mul_Acc(11, 1) Mul_Acc(12, 0) \
1135 Mul_SaveAcc(12, 0, 13) Mul_Acc(1, 12) Mul_Acc(2, 11) Mul_Acc(3, 10) Mul_Acc(4, 9) Mul_Acc(5, 8) Mul_Acc(6, 7) Mul_Acc(7, 6) Mul_Acc(8, 5) Mul_Acc(9, 4) Mul_Acc(10, 3) Mul_Acc(11, 2) Mul_Acc(12, 1) Mul_Acc(13, 0) \
1136 Mul_SaveAcc(13, 0, 14) Mul_Acc(1, 13) Mul_Acc(2, 12) Mul_Acc(3, 11) Mul_Acc(4, 10) Mul_Acc(5, 9) Mul_Acc(6, 8) Mul_Acc(7, 7) Mul_Acc(8, 6) Mul_Acc(9, 5) Mul_Acc(10, 4) Mul_Acc(11, 3) Mul_Acc(12, 2) Mul_Acc(13, 1) Mul_Acc(14, 0) \
1137 Bot_SaveAcc(14, 0, 15) Bot_Acc(1, 14) Bot_Acc(2, 13) Bot_Acc(3, 12) Bot_Acc(4, 11) Bot_Acc(5, 10) Bot_Acc(6, 9) Bot_Acc(7, 8) Bot_Acc(8, 7) Bot_Acc(9, 6) Bot_Acc(10, 5) Bot_Acc(11, 4) Bot_Acc(12, 3) Bot_Acc(13, 2) Bot_Acc(14, 1) Bot_Acc(15, 0) \
1138 Bot_End(16)
1139
1140#endif
1141
1142#if 0
1143#define Mul_Begin(n) \
1144 Declare2Words(p) \
1145 Declare2Words(c) \
1146 Declare2Words(d) \
1147 MultiplyWords(p, A[0], B[0]) \
1148 AssignWord(c, LowWord(p)) \
1149 AssignWord(d, HighWord(p))
1150
1151#define Mul_Acc(i, j) \
1152 MultiplyWords(p, A[i], B[j]) \
1153 Acc2WordsBy1(c, LowWord(p)) \
1154 Acc2WordsBy1(d, HighWord(p))
1155
1156#define Mul_SaveAcc(k, i, j) \
1157 R[k] = LowWord(c); \
1158 Add2WordsBy1(c, d, HighWord(c)) \
1159 MultiplyWords(p, A[i], B[j]) \
1160 AssignWord(d, HighWord(p)) \
1161 Acc2WordsBy1(c, LowWord(p))
1162
1163#define Mul_End(n) \
1164 R[2*n-3] = LowWord(c); \
1165 Acc2WordsBy1(d, HighWord(c)) \
1166 MultiplyWords(p, A[n-1], B[n-1])\
1167 Acc2WordsBy2(d, p) \
1168 R[2*n-2] = LowWord(d); \
1169 R[2*n-1] = HighWord(d);
1170
1171#define Bot_SaveAcc(k, i, j) \
1172 R[k] = LowWord(c); \
1173 word e = LowWord(d) + HighWord(c); \
1174 e += A[i] * B[j];
1175
1176#define Bot_Acc(i, j) \
1177 e += A[i] * B[j];
1178
1179#define Bot_End(n) \
1180 R[n-1] = e;
1181#else
1182#define Mul_Begin(n) \
1183 Declare2Words(p) \
1184 word c; \
1185 Declare2Words(d) \
1186 MultiplyWords(p, A[0], B[0]) \
1187 c = LowWord(p); \
1188 AssignWord(d, HighWord(p))
1189
1190#define Mul_Acc(i, j) \
1191 MulAcc(c, d, A[i], B[j])
1192
1193#define Mul_SaveAcc(k, i, j) \
1194 R[k] = c; \
1195 c = LowWord(d); \
1196 AssignWord(d, HighWord(d)) \
1197 MulAcc(c, d, A[i], B[j])
1198
1199#define Mul_End(k, i) \
1200 R[k] = c; \
1201 MultiplyWords(p, A[i], B[i]) \
1202 Acc2WordsBy2(p, d) \
1203 R[k+1] = LowWord(p); \
1204 R[k+2] = HighWord(p);
1205
1206#define Bot_SaveAcc(k, i, j) \
1207 R[k] = c; \
1208 c = LowWord(d); \
1209 c += A[i] * B[j];
1210
1211#define Bot_Acc(i, j) \
1212 c += A[i] * B[j];
1213
1214#define Bot_End(n) \
1215 R[n-1] = c;
1216#endif
1217
1218#define Squ_Begin(n) \
1219 Declare2Words(p) \
1220 word c; \
1221 Declare2Words(d) \
1222 Declare2Words(e) \
1223 MultiplyWords(p, A[0], A[0]) \
1224 R[0] = LowWord(p); \
1225 AssignWord(e, HighWord(p)) \
1226 MultiplyWords(p, A[0], A[1]) \
1227 c = LowWord(p); \
1228 AssignWord(d, HighWord(p)) \
1229 Squ_NonDiag \
1230
1231#define Squ_NonDiag \
1232 Double3Words(c, d)
1233
1234#define Squ_SaveAcc(k, i, j) \
1235 Acc3WordsBy2(c, d, e) \
1236 R[k] = c; \
1237 MultiplyWords(p, A[i], A[j]) \
1238 c = LowWord(p); \
1239 AssignWord(d, HighWord(p)) \
1240
1241#define Squ_Acc(i, j) \
1242 MulAcc(c, d, A[i], A[j])
1243
1244#define Squ_Diag(i) \
1245 Squ_NonDiag \
1246 MulAcc(c, d, A[i], A[i])
1247
1248#define Squ_End(n) \
1249 Acc3WordsBy2(c, d, e) \
1250 R[2*n-3] = c; \
1251 MultiplyWords(p, A[n-1], A[n-1])\
1252 Acc2WordsBy2(p, e) \
1253 R[2*n-2] = LowWord(p); \
1254 R[2*n-1] = HighWord(p);
1255
1256
1257void Baseline_Multiply2(word *R, const word *AA, const word *BB)
1258{
1259 // http://github.com/weidai11/cryptopp/issues/188
1260 MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1261 MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1262
1263 Mul_2
1264}
1265
1266void Baseline_Multiply4(word *R, const word *AA, const word *BB)
1267{
1268 // http://github.com/weidai11/cryptopp/issues/188
1269 MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1270 MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1271
1272 Mul_4
1273}
1274
1275void Baseline_Multiply8(word *R, const word *AA, const word *BB)
1276{
1277 // http://github.com/weidai11/cryptopp/issues/188
1278 MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1279 MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1280
1281 Mul_8
1282}
1283
1284void Baseline_Square2(word *R, const word *AA)
1285{
1286 // http://github.com/weidai11/cryptopp/issues/188
1287 MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1288
1289 Squ_2
1290}
1291
1292void Baseline_Square4(word *R, const word *AA)
1293{
1294 // http://github.com/weidai11/cryptopp/issues/188
1295 MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1296
1297 Squ_4
1298}
1299
1300void Baseline_Square8(word *R, const word *AA)
1301{
1302 // http://github.com/weidai11/cryptopp/issues/188
1303 MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1304
1305 Squ_8
1306}
1307
1308void Baseline_MultiplyBottom2(word *R, const word *AA, const word *BB)
1309{
1310 // http://github.com/weidai11/cryptopp/issues/188
1311 MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1312 MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1313
1314 Bot_2
1315
1316// http://github.com/weidai11/cryptopp/issues/340
1317#if defined(TWO_64_BIT_WORDS)
1318 CRYPTOPP_UNUSED(d0); CRYPTOPP_UNUSED(d1);
1319#endif
1320}
1321
1322void Baseline_MultiplyBottom4(word *R, const word *AA, const word *BB)
1323{
1324 // http://github.com/weidai11/cryptopp/issues/188
1325 MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1326 MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1327
1328 Bot_4
1329}
1330
1331void Baseline_MultiplyBottom8(word *R, const word *AA, const word *BB)
1332{
1333 // http://github.com/weidai11/cryptopp/issues/188
1334 MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1335 MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1336
1337 Bot_8
1338}
1339
1340#define Top_Begin(n) \
1341 Declare2Words(p) \
1342 word c; \
1343 Declare2Words(d) \
1344 MultiplyWords(p, A[0], B[n-2]);\
1345 AssignWord(d, HighWord(p));
1346
1347#define Top_Acc(i, j) \
1348 MultiplyWords(p, A[i], B[j]);\
1349 Acc2WordsBy1(d, HighWord(p));
1350
1351#define Top_SaveAcc0(i, j) \
1352 c = LowWord(d); \
1353 AssignWord(d, HighWord(d)) \
1354 MulAcc(c, d, A[i], B[j])
1355
1356#define Top_SaveAcc1(i, j) \
1357 c = L<c; \
1358 Acc2WordsBy1(d, c); \
1359 c = LowWord(d); \
1360 AssignWord(d, HighWord(d)) \
1361 MulAcc(c, d, A[i], B[j])
1362
1363void Baseline_MultiplyTop2(word *R, const word *A, const word *B, word L)
1364{
1365 CRYPTOPP_UNUSED(L);
1366 word T[4];
1367 Baseline_Multiply2(T, A, B);
1368 R[0] = T[2];
1369 R[1] = T[3];
1370}
1371
1372void Baseline_MultiplyTop4(word *R, const word *AA, const word *BB, word L)
1373{
1374 // http://github.com/weidai11/cryptopp/issues/188
1375 MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1376 MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1377
1378 Top_Begin(4)
1379 Top_Acc(1, 1) Top_Acc(2, 0) \
1380 Top_SaveAcc0(0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1381 Top_SaveAcc1(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) \
1382 Mul_SaveAcc(0, 2, 3) Mul_Acc(3, 2) \
1383 Mul_End(1, 3)
1384}
1385
1386void Baseline_MultiplyTop8(word *R, const word *AA, const word *BB, word L)
1387{
1388 // http://github.com/weidai11/cryptopp/issues/188
1389 MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1390 MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1391
1392 Top_Begin(8)
1393 Top_Acc(1, 5) Top_Acc(2, 4) Top_Acc(3, 3) Top_Acc(4, 2) Top_Acc(5, 1) Top_Acc(6, 0) \
1394 Top_SaveAcc0(0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
1395 Top_SaveAcc1(1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) \
1396 Mul_SaveAcc(0, 2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) \
1397 Mul_SaveAcc(1, 3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) \
1398 Mul_SaveAcc(2, 4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) \
1399 Mul_SaveAcc(3, 5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) \
1400 Mul_SaveAcc(4, 6, 7) Mul_Acc(7, 6) \
1401 Mul_End(5, 7)
1402}
1403
1404#if !CRYPTOPP_INTEGER_SSE2 // save memory by not compiling these functions when SSE2 is available
1405void Baseline_Multiply16(word *R, const word *AA, const word *BB)
1406{
1407 // http://github.com/weidai11/cryptopp/issues/188
1408 MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1409 MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1410
1411 Mul_16
1412}
1413
1414void Baseline_Square16(word *R, const word *AA)
1415{
1416 // http://github.com/weidai11/cryptopp/issues/188
1417 MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1418
1419 Squ_16
1420}
1421
1422void Baseline_MultiplyBottom16(word *R, const word *AA, const word *BB)
1423{
1424 // http://github.com/weidai11/cryptopp/issues/188
1425 MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1426 MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1427
1428 Bot_16
1429}
1430
1431void Baseline_MultiplyTop16(word *R, const word *AA, const word *BB, word L)
1432{
1433 // http://github.com/weidai11/cryptopp/issues/188
1434 MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1435 MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1436
1437 Top_Begin(16)
1438 Top_Acc(1, 13) Top_Acc(2, 12) Top_Acc(3, 11) Top_Acc(4, 10) Top_Acc(5, 9) Top_Acc(6, 8) Top_Acc(7, 7) Top_Acc(8, 6) Top_Acc(9, 5) Top_Acc(10, 4) Top_Acc(11, 3) Top_Acc(12, 2) Top_Acc(13, 1) Top_Acc(14, 0) \
1439 Top_SaveAcc0(0, 15) Mul_Acc(1, 14) Mul_Acc(2, 13) Mul_Acc(3, 12) Mul_Acc(4, 11) Mul_Acc(5, 10) Mul_Acc(6, 9) Mul_Acc(7, 8) Mul_Acc(8, 7) Mul_Acc(9, 6) Mul_Acc(10, 5) Mul_Acc(11, 4) Mul_Acc(12, 3) Mul_Acc(13, 2) Mul_Acc(14, 1) Mul_Acc(15, 0) \
1440 Top_SaveAcc1(1, 15) Mul_Acc(2, 14) Mul_Acc(3, 13) Mul_Acc(4, 12) Mul_Acc(5, 11) Mul_Acc(6, 10) Mul_Acc(7, 9) Mul_Acc(8, 8) Mul_Acc(9, 7) Mul_Acc(10, 6) Mul_Acc(11, 5) Mul_Acc(12, 4) Mul_Acc(13, 3) Mul_Acc(14, 2) Mul_Acc(15, 1) \
1441 Mul_SaveAcc(0, 2, 15) Mul_Acc(3, 14) Mul_Acc(4, 13) Mul_Acc(5, 12) Mul_Acc(6, 11) Mul_Acc(7, 10) Mul_Acc(8, 9) Mul_Acc(9, 8) Mul_Acc(10, 7) Mul_Acc(11, 6) Mul_Acc(12, 5) Mul_Acc(13, 4) Mul_Acc(14, 3) Mul_Acc(15, 2) \
1442 Mul_SaveAcc(1, 3, 15) Mul_Acc(4, 14) Mul_Acc(5, 13) Mul_Acc(6, 12) Mul_Acc(7, 11) Mul_Acc(8, 10) Mul_Acc(9, 9) Mul_Acc(10, 8) Mul_Acc(11, 7) Mul_Acc(12, 6) Mul_Acc(13, 5) Mul_Acc(14, 4) Mul_Acc(15, 3) \
1443 Mul_SaveAcc(2, 4, 15) Mul_Acc(5, 14) Mul_Acc(6, 13) Mul_Acc(7, 12) Mul_Acc(8, 11) Mul_Acc(9, 10) Mul_Acc(10, 9) Mul_Acc(11, 8) Mul_Acc(12, 7) Mul_Acc(13, 6) Mul_Acc(14, 5) Mul_Acc(15, 4) \
1444 Mul_SaveAcc(3, 5, 15) Mul_Acc(6, 14) Mul_Acc(7, 13) Mul_Acc(8, 12) Mul_Acc(9, 11) Mul_Acc(10, 10) Mul_Acc(11, 9) Mul_Acc(12, 8) Mul_Acc(13, 7) Mul_Acc(14, 6) Mul_Acc(15, 5) \
1445 Mul_SaveAcc(4, 6, 15) Mul_Acc(7, 14) Mul_Acc(8, 13) Mul_Acc(9, 12) Mul_Acc(10, 11) Mul_Acc(11, 10) Mul_Acc(12, 9) Mul_Acc(13, 8) Mul_Acc(14, 7) Mul_Acc(15, 6) \
1446 Mul_SaveAcc(5, 7, 15) Mul_Acc(8, 14) Mul_Acc(9, 13) Mul_Acc(10, 12) Mul_Acc(11, 11) Mul_Acc(12, 10) Mul_Acc(13, 9) Mul_Acc(14, 8) Mul_Acc(15, 7) \
1447 Mul_SaveAcc(6, 8, 15) Mul_Acc(9, 14) Mul_Acc(10, 13) Mul_Acc(11, 12) Mul_Acc(12, 11) Mul_Acc(13, 10) Mul_Acc(14, 9) Mul_Acc(15, 8) \
1448 Mul_SaveAcc(7, 9, 15) Mul_Acc(10, 14) Mul_Acc(11, 13) Mul_Acc(12, 12) Mul_Acc(13, 11) Mul_Acc(14, 10) Mul_Acc(15, 9) \
1449 Mul_SaveAcc(8, 10, 15) Mul_Acc(11, 14) Mul_Acc(12, 13) Mul_Acc(13, 12) Mul_Acc(14, 11) Mul_Acc(15, 10) \
1450 Mul_SaveAcc(9, 11, 15) Mul_Acc(12, 14) Mul_Acc(13, 13) Mul_Acc(14, 12) Mul_Acc(15, 11) \
1451 Mul_SaveAcc(10, 12, 15) Mul_Acc(13, 14) Mul_Acc(14, 13) Mul_Acc(15, 12) \
1452 Mul_SaveAcc(11, 13, 15) Mul_Acc(14, 14) Mul_Acc(15, 13) \
1453 Mul_SaveAcc(12, 14, 15) Mul_Acc(15, 14) \
1454 Mul_End(13, 15)
1455}
1456#endif
1457
1458// ********************************************************
1459
1460#if CRYPTOPP_INTEGER_SSE2
1461
1462CRYPTOPP_ALIGN_DATA(16)
1463CRYPTOPP_TABLE
1464const word32 s_maskLow16[4] = {
1465 0xffff,0xffff,0xffff,0xffff
1466};
1467
1468#undef Mul_Begin
1469#undef Mul_Acc
1470#undef Top_Begin
1471#undef Top_Acc
1472#undef Squ_Acc
1473#undef Squ_NonDiag
1474#undef Squ_Diag
1475#undef Squ_SaveAcc
1476#undef Squ_Begin
1477#undef Mul_SaveAcc
1478#undef Bot_Acc
1479#undef Bot_SaveAcc
1480#undef Bot_End
1481#undef Squ_End
1482#undef Mul_End
1483
1484#define SSE2_FinalSave(k) \
1485 AS2( psllq xmm5, 16) \
1486 AS2( paddq xmm4, xmm5) \
1487 AS2( movq QWORD PTR [ecx+8*(k)], xmm4)
1488
1489#define SSE2_SaveShift(k) \
1490 AS2( movq xmm0, xmm6) \
1491 AS2( punpckhqdq xmm6, xmm0) \
1492 AS2( movq xmm1, xmm7) \
1493 AS2( punpckhqdq xmm7, xmm1) \
1494 AS2( paddd xmm6, xmm0) \
1495 AS2( pslldq xmm6, 4) \
1496 AS2( paddd xmm7, xmm1) \
1497 AS2( paddd xmm4, xmm6) \
1498 AS2( pslldq xmm7, 4) \
1499 AS2( movq xmm6, xmm4) \
1500 AS2( paddd xmm5, xmm7) \
1501 AS2( movq xmm7, xmm5) \
1502 AS2( movd DWORD PTR [ecx+8*(k)], xmm4) \
1503 AS2( psrlq xmm6, 16) \
1504 AS2( paddq xmm6, xmm7) \
1505 AS2( punpckhqdq xmm4, xmm0) \
1506 AS2( punpckhqdq xmm5, xmm0) \
1507 AS2( movq QWORD PTR [ecx+8*(k)+2], xmm6) \
1508 AS2( psrlq xmm6, 3*16) \
1509 AS2( paddd xmm4, xmm6) \
1510
1511#define Squ_SSE2_SaveShift(k) \
1512 AS2( movq xmm0, xmm6) \
1513 AS2( punpckhqdq xmm6, xmm0) \
1514 AS2( movq xmm1, xmm7) \
1515 AS2( punpckhqdq xmm7, xmm1) \
1516 AS2( paddd xmm6, xmm0) \
1517 AS2( pslldq xmm6, 4) \
1518 AS2( paddd xmm7, xmm1) \
1519 AS2( paddd xmm4, xmm6) \
1520 AS2( pslldq xmm7, 4) \
1521 AS2( movhlps xmm6, xmm4) \
1522 AS2( movd DWORD PTR [ecx+8*(k)], xmm4) \
1523 AS2( paddd xmm5, xmm7) \
1524 AS2( movhps QWORD PTR [esp+12], xmm5)\
1525 AS2( psrlq xmm4, 16) \
1526 AS2( paddq xmm4, xmm5) \
1527 AS2( movq QWORD PTR [ecx+8*(k)+2], xmm4) \
1528 AS2( psrlq xmm4, 3*16) \
1529 AS2( paddd xmm4, xmm6) \
1530 AS2( movq QWORD PTR [esp+4], xmm4)\
1531
1532#define SSE2_FirstMultiply(i) \
1533 AS2( movdqa xmm7, [esi+(i)*16])\
1534 AS2( movdqa xmm5, [edi-(i)*16])\
1535 AS2( pmuludq xmm5, xmm7) \
1536 AS2( movdqa xmm4, [ebx])\
1537 AS2( movdqa xmm6, xmm4) \
1538 AS2( pand xmm4, xmm5) \
1539 AS2( psrld xmm5, 16) \
1540 AS2( pmuludq xmm7, [edx-(i)*16])\
1541 AS2( pand xmm6, xmm7) \
1542 AS2( psrld xmm7, 16)
1543
1544#define Squ_Begin(n) \
1545 SquPrologue \
1546 AS2( mov esi, esp)\
1547 AS2( and esp, 0xfffffff0)\
1548 AS2( lea edi, [esp-32*n])\
1549 AS2( sub esp, 32*n+16)\
1550 AS1( push esi)\
1551 AS2( mov esi, edi) \
1552 AS2( xor edx, edx) \
1553 ASL(1) \
1554 ASS( pshufd xmm0, [eax+edx], 3,1,2,0) \
1555 ASS( pshufd xmm1, [eax+edx], 2,0,3,1) \
1556 AS2( movdqa [edi+2*edx], xmm0) \
1557 AS2( psrlq xmm0, 32) \
1558 AS2( movdqa [edi+2*edx+16], xmm0) \
1559 AS2( movdqa [edi+16*n+2*edx], xmm1) \
1560 AS2( psrlq xmm1, 32) \
1561 AS2( movdqa [edi+16*n+2*edx+16], xmm1) \
1562 AS2( add edx, 16) \
1563 AS2( cmp edx, 8*(n)) \
1564 ASJ( jne, 1, b) \
1565 AS2( lea edx, [edi+16*n])\
1566 SSE2_FirstMultiply(0) \
1567
1568#define Squ_Acc(i) \
1569 ASL(LSqu##i) \
1570 AS2( movdqa xmm1, [esi+(i)*16]) \
1571 AS2( movdqa xmm0, [edi-(i)*16]) \
1572 AS2( movdqa xmm2, [ebx]) \
1573 AS2( pmuludq xmm0, xmm1) \
1574 AS2( pmuludq xmm1, [edx-(i)*16]) \
1575 AS2( movdqa xmm3, xmm2) \
1576 AS2( pand xmm2, xmm0) \
1577 AS2( psrld xmm0, 16) \
1578 AS2( paddd xmm4, xmm2) \
1579 AS2( paddd xmm5, xmm0) \
1580 AS2( pand xmm3, xmm1) \
1581 AS2( psrld xmm1, 16) \
1582 AS2( paddd xmm6, xmm3) \
1583 AS2( paddd xmm7, xmm1) \
1584
1585#define Squ_Acc1(i)
1586#define Squ_Acc2(i) ASC(call, LSqu##i)
1587#define Squ_Acc3(i) Squ_Acc2(i)
1588#define Squ_Acc4(i) Squ_Acc2(i)
1589#define Squ_Acc5(i) Squ_Acc2(i)
1590#define Squ_Acc6(i) Squ_Acc2(i)
1591#define Squ_Acc7(i) Squ_Acc2(i)
1592#define Squ_Acc8(i) Squ_Acc2(i)
1593
1594#define SSE2_End(E, n) \
1595 SSE2_SaveShift(2*(n)-3) \
1596 AS2( movdqa xmm7, [esi+16]) \
1597 AS2( movdqa xmm0, [edi]) \
1598 AS2( pmuludq xmm0, xmm7) \
1599 AS2( movdqa xmm2, [ebx]) \
1600 AS2( pmuludq xmm7, [edx]) \
1601 AS2( movdqa xmm6, xmm2) \
1602 AS2( pand xmm2, xmm0) \
1603 AS2( psrld xmm0, 16) \
1604 AS2( paddd xmm4, xmm2) \
1605 AS2( paddd xmm5, xmm0) \
1606 AS2( pand xmm6, xmm7) \
1607 AS2( psrld xmm7, 16) \
1608 SSE2_SaveShift(2*(n)-2) \
1609 SSE2_FinalSave(2*(n)-1) \
1610 AS1( pop esp)\
1611 E
1612
1613#define Squ_End(n) SSE2_End(SquEpilogue, n)
1614#define Mul_End(n) SSE2_End(MulEpilogue, n)
1615#define Top_End(n) SSE2_End(TopEpilogue, n)
1616
1617#define Squ_Column1(k, i) \
1618 Squ_SSE2_SaveShift(k) \
1619 AS2( add esi, 16) \
1620 SSE2_FirstMultiply(1)\
1621 Squ_Acc##i(i) \
1622 AS2( paddd xmm4, xmm4) \
1623 AS2( paddd xmm5, xmm5) \
1624 AS2( movdqa xmm3, [esi]) \
1625 AS2( movq xmm1, QWORD PTR [esi+8]) \
1626 AS2( pmuludq xmm1, xmm3) \
1627 AS2( pmuludq xmm3, xmm3) \
1628 AS2( movdqa xmm0, [ebx])\
1629 AS2( movdqa xmm2, xmm0) \
1630 AS2( pand xmm0, xmm1) \
1631 AS2( psrld xmm1, 16) \
1632 AS2( paddd xmm6, xmm0) \
1633 AS2( paddd xmm7, xmm1) \
1634 AS2( pand xmm2, xmm3) \
1635 AS2( psrld xmm3, 16) \
1636 AS2( paddd xmm6, xmm6) \
1637 AS2( paddd xmm7, xmm7) \
1638 AS2( paddd xmm4, xmm2) \
1639 AS2( paddd xmm5, xmm3) \
1640 AS2( movq xmm0, QWORD PTR [esp+4])\
1641 AS2( movq xmm1, QWORD PTR [esp+12])\
1642 AS2( paddd xmm4, xmm0)\
1643 AS2( paddd xmm5, xmm1)\
1644
1645#define Squ_Column0(k, i) \
1646 Squ_SSE2_SaveShift(k) \
1647 AS2( add edi, 16) \
1648 AS2( add edx, 16) \
1649 SSE2_FirstMultiply(1)\
1650 Squ_Acc##i(i) \
1651 AS2( paddd xmm6, xmm6) \
1652 AS2( paddd xmm7, xmm7) \
1653 AS2( paddd xmm4, xmm4) \
1654 AS2( paddd xmm5, xmm5) \
1655 AS2( movq xmm0, QWORD PTR [esp+4])\
1656 AS2( movq xmm1, QWORD PTR [esp+12])\
1657 AS2( paddd xmm4, xmm0)\
1658 AS2( paddd xmm5, xmm1)\
1659
1660#define SSE2_MulAdd45 \
1661 AS2( movdqa xmm7, [esi]) \
1662 AS2( movdqa xmm0, [edi]) \
1663 AS2( pmuludq xmm0, xmm7) \
1664 AS2( movdqa xmm2, [ebx]) \
1665 AS2( pmuludq xmm7, [edx]) \
1666 AS2( movdqa xmm6, xmm2) \
1667 AS2( pand xmm2, xmm0) \
1668 AS2( psrld xmm0, 16) \
1669 AS2( paddd xmm4, xmm2) \
1670 AS2( paddd xmm5, xmm0) \
1671 AS2( pand xmm6, xmm7) \
1672 AS2( psrld xmm7, 16)
1673
1674#define Mul_Begin(n) \
1675 MulPrologue \
1676 AS2( mov esi, esp)\
1677 AS2( and esp, 0xfffffff0)\
1678 AS2( sub esp, 48*n+16)\
1679 AS1( push esi)\
1680 AS2( xor edx, edx) \
1681 ASL(1) \
1682 ASS( pshufd xmm0, [eax+edx], 3,1,2,0) \
1683 ASS( pshufd xmm1, [eax+edx], 2,0,3,1) \
1684 ASS( pshufd xmm2, [edi+edx], 3,1,2,0) \
1685 AS2( movdqa [esp+20+2*edx], xmm0) \
1686 AS2( psrlq xmm0, 32) \
1687 AS2( movdqa [esp+20+2*edx+16], xmm0) \
1688 AS2( movdqa [esp+20+16*n+2*edx], xmm1) \
1689 AS2( psrlq xmm1, 32) \
1690 AS2( movdqa [esp+20+16*n+2*edx+16], xmm1) \
1691 AS2( movdqa [esp+20+32*n+2*edx], xmm2) \
1692 AS2( psrlq xmm2, 32) \
1693 AS2( movdqa [esp+20+32*n+2*edx+16], xmm2) \
1694 AS2( add edx, 16) \
1695 AS2( cmp edx, 8*(n)) \
1696 ASJ( jne, 1, b) \
1697 AS2( lea edi, [esp+20])\
1698 AS2( lea edx, [esp+20+16*n])\
1699 AS2( lea esi, [esp+20+32*n])\
1700 SSE2_FirstMultiply(0) \
1701
1702#define Mul_Acc(i) \
1703 ASL(LMul##i) \
1704 AS2( movdqa xmm1, [esi+i/2*(1-(i-2*(i/2))*2)*16]) \
1705 AS2( movdqa xmm0, [edi-i/2*(1-(i-2*(i/2))*2)*16]) \
1706 AS2( movdqa xmm2, [ebx]) \
1707 AS2( pmuludq xmm0, xmm1) \
1708 AS2( pmuludq xmm1, [edx-i/2*(1-(i-2*(i/2))*2)*16]) \
1709 AS2( movdqa xmm3, xmm2) \
1710 AS2( pand xmm2, xmm0) \
1711 AS2( psrld xmm0, 16) \
1712 AS2( paddd xmm4, xmm2) \
1713 AS2( paddd xmm5, xmm0) \
1714 AS2( pand xmm3, xmm1) \
1715 AS2( psrld xmm1, 16) \
1716 AS2( paddd xmm6, xmm3) \
1717 AS2( paddd xmm7, xmm1) \
1718
1719#define Mul_Acc1(i)
1720#define Mul_Acc2(i) ASC(call, LMul##i)
1721#define Mul_Acc3(i) Mul_Acc2(i)
1722#define Mul_Acc4(i) Mul_Acc2(i)
1723#define Mul_Acc5(i) Mul_Acc2(i)
1724#define Mul_Acc6(i) Mul_Acc2(i)
1725#define Mul_Acc7(i) Mul_Acc2(i)
1726#define Mul_Acc8(i) Mul_Acc2(i)
1727#define Mul_Acc9(i) Mul_Acc2(i)
1728#define Mul_Acc10(i) Mul_Acc2(i)
1729#define Mul_Acc11(i) Mul_Acc2(i)
1730#define Mul_Acc12(i) Mul_Acc2(i)
1731#define Mul_Acc13(i) Mul_Acc2(i)
1732#define Mul_Acc14(i) Mul_Acc2(i)
1733#define Mul_Acc15(i) Mul_Acc2(i)
1734#define Mul_Acc16(i) Mul_Acc2(i)
1735
1736#define Mul_Column1(k, i) \
1737 SSE2_SaveShift(k) \
1738 AS2( add esi, 16) \
1739 SSE2_MulAdd45\
1740 Mul_Acc##i(i) \
1741
1742#define Mul_Column0(k, i) \
1743 SSE2_SaveShift(k) \
1744 AS2( add edi, 16) \
1745 AS2( add edx, 16) \
1746 SSE2_MulAdd45\
1747 Mul_Acc##i(i) \
1748
1749#define Bot_Acc(i) \
1750 AS2( movdqa xmm1, [esi+i/2*(1-(i-2*(i/2))*2)*16]) \
1751 AS2( movdqa xmm0, [edi-i/2*(1-(i-2*(i/2))*2)*16]) \
1752 AS2( pmuludq xmm0, xmm1) \
1753 AS2( pmuludq xmm1, [edx-i/2*(1-(i-2*(i/2))*2)*16]) \
1754 AS2( paddq xmm4, xmm0) \
1755 AS2( paddd xmm6, xmm1)
1756
1757#define Bot_SaveAcc(k) \
1758 SSE2_SaveShift(k) \
1759 AS2( add edi, 16) \
1760 AS2( add edx, 16) \
1761 AS2( movdqa xmm6, [esi]) \
1762 AS2( movdqa xmm0, [edi]) \
1763 AS2( pmuludq xmm0, xmm6) \
1764 AS2( paddq xmm4, xmm0) \
1765 AS2( psllq xmm5, 16) \
1766 AS2( paddq xmm4, xmm5) \
1767 AS2( pmuludq xmm6, [edx])
1768
1769#define Bot_End(n) \
1770 AS2( movhlps xmm7, xmm6) \
1771 AS2( paddd xmm6, xmm7) \
1772 AS2( psllq xmm6, 32) \
1773 AS2( paddd xmm4, xmm6) \
1774 AS2( movq QWORD PTR [ecx+8*((n)-1)], xmm4) \
1775 AS1( pop esp)\
1776 MulEpilogue
1777
1778#define Top_Begin(n) \
1779 TopPrologue \
1780 AS2( mov edx, esp)\
1781 AS2( and esp, 0xfffffff0)\
1782 AS2( sub esp, 48*n+16)\
1783 AS1( push edx)\
1784 AS2( xor edx, edx) \
1785 ASL(1) \
1786 ASS( pshufd xmm0, [eax+edx], 3,1,2,0) \
1787 ASS( pshufd xmm1, [eax+edx], 2,0,3,1) \
1788 ASS( pshufd xmm2, [edi+edx], 3,1,2,0) \
1789 AS2( movdqa [esp+20+2*edx], xmm0) \
1790 AS2( psrlq xmm0, 32) \
1791 AS2( movdqa [esp+20+2*edx+16], xmm0) \
1792 AS2( movdqa [esp+20+16*n+2*edx], xmm1) \
1793 AS2( psrlq xmm1, 32) \
1794 AS2( movdqa [esp+20+16*n+2*edx+16], xmm1) \
1795 AS2( movdqa [esp+20+32*n+2*edx], xmm2) \
1796 AS2( psrlq xmm2, 32) \
1797 AS2( movdqa [esp+20+32*n+2*edx+16], xmm2) \
1798 AS2( add edx, 16) \
1799 AS2( cmp edx, 8*(n)) \
1800 ASJ( jne, 1, b) \
1801 AS2( mov eax, esi) \
1802 AS2( lea edi, [esp+20+00*n+16*(n/2-1)])\
1803 AS2( lea edx, [esp+20+16*n+16*(n/2-1)])\
1804 AS2( lea esi, [esp+20+32*n+16*(n/2-1)])\
1805 AS2( pxor xmm4, xmm4)\
1806 AS2( pxor xmm5, xmm5)
1807
1808#define Top_Acc(i) \
1809 AS2( movq xmm0, QWORD PTR [esi+i/2*(1-(i-2*(i/2))*2)*16+8]) \
1810 AS2( pmuludq xmm0, [edx-i/2*(1-(i-2*(i/2))*2)*16]) \
1811 AS2( psrlq xmm0, 48) \
1812 AS2( paddd xmm5, xmm0)\
1813
1814#define Top_Column0(i) \
1815 AS2( psllq xmm5, 32) \
1816 AS2( add edi, 16) \
1817 AS2( add edx, 16) \
1818 SSE2_MulAdd45\
1819 Mul_Acc##i(i) \
1820
1821#define Top_Column1(i) \
1822 SSE2_SaveShift(0) \
1823 AS2( add esi, 16) \
1824 SSE2_MulAdd45\
1825 Mul_Acc##i(i) \
1826 AS2( shr eax, 16) \
1827 AS2( movd xmm0, eax)\
1828 AS2( movd xmm1, [ecx+4])\
1829 AS2( psrld xmm1, 16)\
1830 AS2( pcmpgtd xmm1, xmm0)\
1831 AS2( psrld xmm1, 31)\
1832 AS2( paddd xmm4, xmm1)\
1833
1834void SSE2_Square4(word *C, const word *A)
1835{
1836 Squ_Begin(2)
1837 Squ_Column0(0, 1)
1838 Squ_End(2)
1839}
1840
1841void SSE2_Square8(word *C, const word *A)
1842{
1843 Squ_Begin(4)
1844#ifndef __GNUC__
1845 ASJ( jmp, 0, f)
1846 Squ_Acc(2)
1847 AS1( ret) ASL(0)
1848#endif
1849 Squ_Column0(0, 1)
1850 Squ_Column1(1, 1)
1851 Squ_Column0(2, 2)
1852 Squ_Column1(3, 1)
1853 Squ_Column0(4, 1)
1854 Squ_End(4)
1855}
1856
1857void SSE2_Square16(word *C, const word *A)
1858{
1859 Squ_Begin(8)
1860#ifndef __GNUC__
1861 ASJ( jmp, 0, f)
1862 Squ_Acc(4) Squ_Acc(3) Squ_Acc(2)
1863 AS1( ret) ASL(0)
1864#endif
1865 Squ_Column0(0, 1)
1866 Squ_Column1(1, 1)
1867 Squ_Column0(2, 2)
1868 Squ_Column1(3, 2)
1869 Squ_Column0(4, 3)
1870 Squ_Column1(5, 3)
1871 Squ_Column0(6, 4)
1872 Squ_Column1(7, 3)
1873 Squ_Column0(8, 3)
1874 Squ_Column1(9, 2)
1875 Squ_Column0(10, 2)
1876 Squ_Column1(11, 1)
1877 Squ_Column0(12, 1)
1878 Squ_End(8)
1879}
1880
1881void SSE2_Square32(word *C, const word *A)
1882{
1883 Squ_Begin(16)
1884 ASJ( jmp, 0, f)
1885 Squ_Acc(8) Squ_Acc(7) Squ_Acc(6) Squ_Acc(5) Squ_Acc(4) Squ_Acc(3) Squ_Acc(2)
1886 AS1( ret) ASL(0)
1887 Squ_Column0(0, 1)
1888 Squ_Column1(1, 1)
1889 Squ_Column0(2, 2)
1890 Squ_Column1(3, 2)
1891 Squ_Column0(4, 3)
1892 Squ_Column1(5, 3)
1893 Squ_Column0(6, 4)
1894 Squ_Column1(7, 4)
1895 Squ_Column0(8, 5)
1896 Squ_Column1(9, 5)
1897 Squ_Column0(10, 6)
1898 Squ_Column1(11, 6)
1899 Squ_Column0(12, 7)
1900 Squ_Column1(13, 7)
1901 Squ_Column0(14, 8)
1902 Squ_Column1(15, 7)
1903 Squ_Column0(16, 7)
1904 Squ_Column1(17, 6)
1905 Squ_Column0(18, 6)
1906 Squ_Column1(19, 5)
1907 Squ_Column0(20, 5)
1908 Squ_Column1(21, 4)
1909 Squ_Column0(22, 4)
1910 Squ_Column1(23, 3)
1911 Squ_Column0(24, 3)
1912 Squ_Column1(25, 2)
1913 Squ_Column0(26, 2)
1914 Squ_Column1(27, 1)
1915 Squ_Column0(28, 1)
1916 Squ_End(16)
1917}
1918
1919void SSE2_Multiply4(word *C, const word *A, const word *B)
1920{
1921 Mul_Begin(2)
1922#ifndef __GNUC__
1923 ASJ( jmp, 0, f)
1924 Mul_Acc(2)
1925 AS1( ret) ASL(0)
1926#endif
1927 Mul_Column0(0, 2)
1928 Mul_End(2)
1929}
1930
1931void SSE2_Multiply8(word *C, const word *A, const word *B)
1932{
1933 Mul_Begin(4)
1934#ifndef __GNUC__
1935 ASJ( jmp, 0, f)
1936 Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
1937 AS1( ret) ASL(0)
1938#endif
1939 Mul_Column0(0, 2)
1940 Mul_Column1(1, 3)
1941 Mul_Column0(2, 4)
1942 Mul_Column1(3, 3)
1943 Mul_Column0(4, 2)
1944 Mul_End(4)
1945}
1946
1947void SSE2_Multiply16(word *C, const word *A, const word *B)
1948{
1949 Mul_Begin(8)
1950#ifndef __GNUC__
1951 ASJ( jmp, 0, f)
1952 Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
1953 AS1( ret) ASL(0)
1954#endif
1955 Mul_Column0(0, 2)
1956 Mul_Column1(1, 3)
1957 Mul_Column0(2, 4)
1958 Mul_Column1(3, 5)
1959 Mul_Column0(4, 6)
1960 Mul_Column1(5, 7)
1961 Mul_Column0(6, 8)
1962 Mul_Column1(7, 7)
1963 Mul_Column0(8, 6)
1964 Mul_Column1(9, 5)
1965 Mul_Column0(10, 4)
1966 Mul_Column1(11, 3)
1967 Mul_Column0(12, 2)
1968 Mul_End(8)
1969}
1970
1971void SSE2_Multiply32(word *C, const word *A, const word *B)
1972{
1973 Mul_Begin(16)
1974 ASJ( jmp, 0, f)
1975 Mul_Acc(16) Mul_Acc(15) Mul_Acc(14) Mul_Acc(13) Mul_Acc(12) Mul_Acc(11) Mul_Acc(10) Mul_Acc(9) Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
1976 AS1( ret) ASL(0)
1977 Mul_Column0(0, 2)
1978 Mul_Column1(1, 3)
1979 Mul_Column0(2, 4)
1980 Mul_Column1(3, 5)
1981 Mul_Column0(4, 6)
1982 Mul_Column1(5, 7)
1983 Mul_Column0(6, 8)
1984 Mul_Column1(7, 9)
1985 Mul_Column0(8, 10)
1986 Mul_Column1(9, 11)
1987 Mul_Column0(10, 12)
1988 Mul_Column1(11, 13)
1989 Mul_Column0(12, 14)
1990 Mul_Column1(13, 15)
1991 Mul_Column0(14, 16)
1992 Mul_Column1(15, 15)
1993 Mul_Column0(16, 14)
1994 Mul_Column1(17, 13)
1995 Mul_Column0(18, 12)
1996 Mul_Column1(19, 11)
1997 Mul_Column0(20, 10)
1998 Mul_Column1(21, 9)
1999 Mul_Column0(22, 8)
2000 Mul_Column1(23, 7)
2001 Mul_Column0(24, 6)
2002 Mul_Column1(25, 5)
2003 Mul_Column0(26, 4)
2004 Mul_Column1(27, 3)
2005 Mul_Column0(28, 2)
2006 Mul_End(16)
2007}
2008
2009void SSE2_MultiplyBottom4(word *C, const word *A, const word *B)
2010{
2011 Mul_Begin(2)
2012 Bot_SaveAcc(0) Bot_Acc(2)
2013 Bot_End(2)
2014}
2015
2016void SSE2_MultiplyBottom8(word *C, const word *A, const word *B)
2017{
2018 Mul_Begin(4)
2019#ifndef __GNUC__
2020 ASJ( jmp, 0, f)
2021 Mul_Acc(3) Mul_Acc(2)
2022 AS1( ret) ASL(0)
2023#endif
2024 Mul_Column0(0, 2)
2025 Mul_Column1(1, 3)
2026 Bot_SaveAcc(2) Bot_Acc(4) Bot_Acc(3) Bot_Acc(2)
2027 Bot_End(4)
2028}
2029
2030void SSE2_MultiplyBottom16(word *C, const word *A, const word *B)
2031{
2032 Mul_Begin(8)
2033#ifndef __GNUC__
2034 ASJ( jmp, 0, f)
2035 Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2036 AS1( ret) ASL(0)
2037#endif
2038 Mul_Column0(0, 2)
2039 Mul_Column1(1, 3)
2040 Mul_Column0(2, 4)
2041 Mul_Column1(3, 5)
2042 Mul_Column0(4, 6)
2043 Mul_Column1(5, 7)
2044 Bot_SaveAcc(6) Bot_Acc(8) Bot_Acc(7) Bot_Acc(6) Bot_Acc(5) Bot_Acc(4) Bot_Acc(3) Bot_Acc(2)
2045 Bot_End(8)
2046}
2047
2048void SSE2_MultiplyBottom32(word *C, const word *A, const word *B)
2049{
2050 Mul_Begin(16)
2051#ifndef __GNUC__
2052 ASJ( jmp, 0, f)
2053 Mul_Acc(15) Mul_Acc(14) Mul_Acc(13) Mul_Acc(12) Mul_Acc(11) Mul_Acc(10) Mul_Acc(9) Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2054 AS1( ret) ASL(0)
2055#endif
2056 Mul_Column0(0, 2)
2057 Mul_Column1(1, 3)
2058 Mul_Column0(2, 4)
2059 Mul_Column1(3, 5)
2060 Mul_Column0(4, 6)
2061 Mul_Column1(5, 7)
2062 Mul_Column0(6, 8)
2063 Mul_Column1(7, 9)
2064 Mul_Column0(8, 10)
2065 Mul_Column1(9, 11)
2066 Mul_Column0(10, 12)
2067 Mul_Column1(11, 13)
2068 Mul_Column0(12, 14)
2069 Mul_Column1(13, 15)
2070 Bot_SaveAcc(14) Bot_Acc(16) Bot_Acc(15) Bot_Acc(14) Bot_Acc(13) Bot_Acc(12) Bot_Acc(11) Bot_Acc(10) Bot_Acc(9) Bot_Acc(8) Bot_Acc(7) Bot_Acc(6) Bot_Acc(5) Bot_Acc(4) Bot_Acc(3) Bot_Acc(2)
2071 Bot_End(16)
2072}
2073
2074void SSE2_MultiplyTop8(word *C, const word *A, const word *B, word L)
2075{
2076 Top_Begin(4)
2077 Top_Acc(3) Top_Acc(2) Top_Acc(1)
2078#ifndef __GNUC__
2079 ASJ( jmp, 0, f)
2080 Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2081 AS1( ret) ASL(0)
2082#endif
2083 Top_Column0(4)
2084 Top_Column1(3)
2085 Mul_Column0(0, 2)
2086 Top_End(2)
2087}
2088
2089void SSE2_MultiplyTop16(word *C, const word *A, const word *B, word L)
2090{
2091 Top_Begin(8)
2092 Top_Acc(7) Top_Acc(6) Top_Acc(5) Top_Acc(4) Top_Acc(3) Top_Acc(2) Top_Acc(1)
2093#ifndef __GNUC__
2094 ASJ( jmp, 0, f)
2095 Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2096 AS1( ret) ASL(0)
2097#endif
2098 Top_Column0(8)
2099 Top_Column1(7)
2100 Mul_Column0(0, 6)
2101 Mul_Column1(1, 5)
2102 Mul_Column0(2, 4)
2103 Mul_Column1(3, 3)
2104 Mul_Column0(4, 2)
2105 Top_End(4)
2106}
2107
2108void SSE2_MultiplyTop32(word *C, const word *A, const word *B, word L)
2109{
2110 Top_Begin(16)
2111 Top_Acc(15) Top_Acc(14) Top_Acc(13) Top_Acc(12) Top_Acc(11) Top_Acc(10) Top_Acc(9) Top_Acc(8) Top_Acc(7) Top_Acc(6) Top_Acc(5) Top_Acc(4) Top_Acc(3) Top_Acc(2) Top_Acc(1)
2112#ifndef __GNUC__
2113 ASJ( jmp, 0, f)
2114 Mul_Acc(16) Mul_Acc(15) Mul_Acc(14) Mul_Acc(13) Mul_Acc(12) Mul_Acc(11) Mul_Acc(10) Mul_Acc(9) Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2115 AS1( ret) ASL(0)
2116#endif
2117 Top_Column0(16)
2118 Top_Column1(15)
2119 Mul_Column0(0, 14)
2120 Mul_Column1(1, 13)
2121 Mul_Column0(2, 12)
2122 Mul_Column1(3, 11)
2123 Mul_Column0(4, 10)
2124 Mul_Column1(5, 9)
2125 Mul_Column0(6, 8)
2126 Mul_Column1(7, 7)
2127 Mul_Column0(8, 6)
2128 Mul_Column1(9, 5)
2129 Mul_Column0(10, 4)
2130 Mul_Column1(11, 3)
2131 Mul_Column0(12, 2)
2132 Top_End(8)
2133}
2134
2135#endif // #if CRYPTOPP_INTEGER_SSE2
2136
2137// ********************************************************
2138
2139typedef int (CRYPTOPP_FASTCALL * PAdd)(size_t N, word *C, const word *A, const word *B);
2140typedef void (* PMul)(word *C, const word *A, const word *B);
2141typedef void (* PSqu)(word *C, const word *A);
2142typedef void (* PMulTop)(word *C, const word *A, const word *B, word L);
2143
2144#if CRYPTOPP_INTEGER_SSE2
2145static PAdd s_pAdd = &Baseline_Add, s_pSub = &Baseline_Sub;
2146static size_t s_recursionLimit = 8;
2147#else
2148static const size_t s_recursionLimit = 16;
2149#endif // CRYPTOPP_INTEGER_SSE2
2150
2151static PMul s_pMul[9], s_pBot[9];
2152static PSqu s_pSqu[9];
2153static PMulTop s_pTop[9];
2154
2155void SetFunctionPointers()
2156{
2157 s_pMul[0] = &Baseline_Multiply2;
2158 s_pBot[0] = &Baseline_MultiplyBottom2;
2159 s_pSqu[0] = &Baseline_Square2;
2160 s_pTop[0] = &Baseline_MultiplyTop2;
2161 s_pTop[1] = &Baseline_MultiplyTop4;
2162
2163#if CRYPTOPP_INTEGER_SSE2
2164 if (HasSSE2())
2165 {
2166 if (IsP4())
2167 {
2168 s_pAdd = &SSE2_Add;
2169 s_pSub = &SSE2_Sub;
2170 }
2171
2172 s_recursionLimit = 32;
2173
2174 s_pMul[1] = &SSE2_Multiply4;
2175 s_pMul[2] = &SSE2_Multiply8;
2176 s_pMul[4] = &SSE2_Multiply16;
2177 s_pMul[8] = &SSE2_Multiply32;
2178
2179 s_pBot[1] = &SSE2_MultiplyBottom4;
2180 s_pBot[2] = &SSE2_MultiplyBottom8;
2181 s_pBot[4] = &SSE2_MultiplyBottom16;
2182 s_pBot[8] = &SSE2_MultiplyBottom32;
2183
2184 s_pSqu[1] = &SSE2_Square4;
2185 s_pSqu[2] = &SSE2_Square8;
2186 s_pSqu[4] = &SSE2_Square16;
2187 s_pSqu[8] = &SSE2_Square32;
2188
2189 s_pTop[2] = &SSE2_MultiplyTop8;
2190 s_pTop[4] = &SSE2_MultiplyTop16;
2191 s_pTop[8] = &SSE2_MultiplyTop32;
2192 }
2193 else
2194#endif // CRYPTOPP_INTEGER_SSE2
2195 {
2196 s_pMul[1] = &Baseline_Multiply4;
2197 s_pMul[2] = &Baseline_Multiply8;
2198
2199 s_pBot[1] = &Baseline_MultiplyBottom4;
2200 s_pBot[2] = &Baseline_MultiplyBottom8;
2201
2202 s_pSqu[1] = &Baseline_Square4;
2203 s_pSqu[2] = &Baseline_Square8;
2204
2205 s_pTop[2] = &Baseline_MultiplyTop8;
2206
2207#if !CRYPTOPP_INTEGER_SSE2
2208 s_pMul[4] = &Baseline_Multiply16;
2209 s_pBot[4] = &Baseline_MultiplyBottom16;
2210 s_pSqu[4] = &Baseline_Square16;
2211 s_pTop[4] = &Baseline_MultiplyTop16;
2212#endif // !CRYPTOPP_INTEGER_SSE2
2213 }
2214}
2215
2216inline int Add(word *C, const word *A, const word *B, size_t N)
2217{
2218#if CRYPTOPP_INTEGER_SSE2
2219 return s_pAdd(N, C, A, B);
2220#else
2221 return Baseline_Add(N, C, A, B);
2222#endif // CRYPTOPP_INTEGER_SSE2
2223}
2224
2225inline int Subtract(word *C, const word *A, const word *B, size_t N)
2226{
2227#if CRYPTOPP_INTEGER_SSE2
2228 return s_pSub(N, C, A, B);
2229#else
2230 return Baseline_Sub(N, C, A, B);
2231#endif // CRYPTOPP_INTEGER_SSE2
2232}
2233
2234// ********************************************************
2235
2236
2237#define A0 A
2238#define A1 (A+N2)
2239#define B0 B
2240#define B1 (B+N2)
2241
2242#define T0 T
2243#define T1 (T+N2)
2244#define T2 (T+N)
2245#define T3 (T+N+N2)
2246
2247#define R0 R
2248#define R1 (R+N2)
2249#define R2 (R+N)
2250#define R3 (R+N+N2)
2251
2252// R[2*N] - result = A*B
2253// T[2*N] - temporary work space
2254// A[N] --- multiplier
2255// B[N] --- multiplicant
2256
2257void RecursiveMultiply(word *R, word *T, const word *A, const word *B, size_t N)
2258{
2259 CRYPTOPP_ASSERT(N>=2 && N%2==0);
2260
2261 if (N <= s_recursionLimit)
2262 s_pMul[N/4](R, A, B);
2263 else
2264 {
2265 const size_t N2 = N/2;
2266
2267 size_t AN2 = Compare(A0, A1, N2) > 0 ? 0 : N2;
2268 Subtract(R0, A + AN2, A + (N2 ^ AN2), N2);
2269
2270 size_t BN2 = Compare(B0, B1, N2) > 0 ? 0 : N2;
2271 Subtract(R1, B + BN2, B + (N2 ^ BN2), N2);
2272
2273 RecursiveMultiply(R2, T2, A1, B1, N2);
2274 RecursiveMultiply(T0, T2, R0, R1, N2);
2275 RecursiveMultiply(R0, T2, A0, B0, N2);
2276
2277 // now T[01] holds (A1-A0)*(B0-B1), R[01] holds A0*B0, R[23] holds A1*B1
2278
2279 int c2 = Add(R2, R2, R1, N2);
2280 int c3 = c2;
2281 c2 += Add(R1, R2, R0, N2);
2282 c3 += Add(R2, R2, R3, N2);
2283
2284 if (AN2 == BN2)
2285 c3 -= Subtract(R1, R1, T0, N);
2286 else
2287 c3 += Add(R1, R1, T0, N);
2288
2289 c3 += Increment(R2, N2, c2);
2290 CRYPTOPP_ASSERT (c3 >= 0 && c3 <= 2);
2291 Increment(R3, N2, c3);
2292 }
2293}
2294
2295// R[2*N] - result = A*A
2296// T[2*N] - temporary work space
2297// A[N] --- number to be squared
2298
2299void RecursiveSquare(word *R, word *T, const word *A, size_t N)
2300{
2301 CRYPTOPP_ASSERT(N && N%2==0);
2302
2303 if (N <= s_recursionLimit)
2304 s_pSqu[N/4](R, A);
2305 else
2306 {
2307 const size_t N2 = N/2;
2308
2309 RecursiveSquare(R0, T2, A0, N2);
2310 RecursiveSquare(R2, T2, A1, N2);
2311 RecursiveMultiply(T0, T2, A0, A1, N2);
2312
2313 int carry = Add(R1, R1, T0, N);
2314 carry += Add(R1, R1, T0, N);
2315 Increment(R3, N2, carry);
2316 }
2317}
2318
2319// R[N] - bottom half of A*B
2320// T[3*N/2] - temporary work space
2321// A[N] - multiplier
2322// B[N] - multiplicant
2323
2324void RecursiveMultiplyBottom(word *R, word *T, const word *A, const word *B, size_t N)
2325{
2326 CRYPTOPP_ASSERT(N>=2 && N%2==0);
2327
2328 if (N <= s_recursionLimit)
2329 s_pBot[N/4](R, A, B);
2330 else
2331 {
2332 const size_t N2 = N/2;
2333
2334 RecursiveMultiply(R, T, A0, B0, N2);
2335 RecursiveMultiplyBottom(T0, T1, A1, B0, N2);
2336 Add(R1, R1, T0, N2);
2337 RecursiveMultiplyBottom(T0, T1, A0, B1, N2);
2338 Add(R1, R1, T0, N2);
2339 }
2340}
2341
2342// R[N] --- upper half of A*B
2343// T[2*N] - temporary work space
2344// L[N] --- lower half of A*B
2345// A[N] --- multiplier
2346// B[N] --- multiplicant
2347
2348void MultiplyTop(word *R, word *T, const word *L, const word *A, const word *B, size_t N)
2349{
2350 CRYPTOPP_ASSERT(N>=2 && N%2==0);
2351
2352 if (N <= s_recursionLimit)
2353 s_pTop[N/4](R, A, B, L[N-1]);
2354 else
2355 {
2356 const size_t N2 = N/2;
2357
2358 size_t AN2 = Compare(A0, A1, N2) > 0 ? 0 : N2;
2359 Subtract(R0, A + AN2, A + (N2 ^ AN2), N2);
2360
2361 size_t BN2 = Compare(B0, B1, N2) > 0 ? 0 : N2;
2362 Subtract(R1, B + BN2, B + (N2 ^ BN2), N2);
2363
2364 RecursiveMultiply(T0, T2, R0, R1, N2);
2365 RecursiveMultiply(R0, T2, A1, B1, N2);
2366
2367 // now T[01] holds (A1-A0)*(B0-B1) = A1*B0+A0*B1-A1*B1-A0*B0, R[01] holds A1*B1
2368
2369 int t, c3;
2370 int c2 = Subtract(T2, L+N2, L, N2);
2371
2372 if (AN2 == BN2)
2373 {
2374 c2 -= Add(T2, T2, T0, N2);
2375 t = (Compare(T2, R0, N2) == -1);
2376 c3 = t - Subtract(T2, T2, T1, N2);
2377 }
2378 else
2379 {
2380 c2 += Subtract(T2, T2, T0, N2);
2381 t = (Compare(T2, R0, N2) == -1);
2382 c3 = t + Add(T2, T2, T1, N2);
2383 }
2384
2385 c2 += t;
2386 if (c2 >= 0)
2387 c3 += Increment(T2, N2, c2);
2388 else
2389 c3 -= Decrement(T2, N2, -c2);
2390 c3 += Add(R0, T2, R1, N2);
2391
2392 CRYPTOPP_ASSERT (c3 >= 0 && c3 <= 2);
2393 Increment(R1, N2, c3);
2394 }
2395}
2396
2397inline void Multiply(word *R, word *T, const word *A, const word *B, size_t N)
2398{
2399 RecursiveMultiply(R, T, A, B, N);
2400}
2401
2402inline void Square(word *R, word *T, const word *A, size_t N)
2403{
2404 RecursiveSquare(R, T, A, N);
2405}
2406
2407inline void MultiplyBottom(word *R, word *T, const word *A, const word *B, size_t N)
2408{
2409 RecursiveMultiplyBottom(R, T, A, B, N);
2410}
2411
2412// R[NA+NB] - result = A*B
2413// T[NA+NB] - temporary work space
2414// A[NA] ---- multiplier
2415// B[NB] ---- multiplicant
2416
2417void AsymmetricMultiply(word *R, word *T, const word *A, size_t NA, const word *B, size_t NB)
2418{
2419 if (NA == NB)
2420 {
2421 // Profiling tells us the original second case was dominant, so it was promoted to the first If statement.
2422 // The code change occurred at Commit dc99266599a0e72d.
2423 if (A != B)
2424 Multiply(R, T, A, B, NA);
2425 else
2426 Square(R, T, A, NA);
2427
2428 return;
2429 }
2430
2431 if (NA > NB)
2432 {
2433 std::swap(A, B);
2434 std::swap(NA, NB);
2435 }
2436
2437 CRYPTOPP_ASSERT(NB % NA == 0);
2438
2439 if (NA==2 && !A[1])
2440 {
2441 // Profiling tells us the original Default case was dominant, so it was
2442 // promoted to the first Case statement. The code change occurred at
2443 // Commit dc99266599a0e72d.
2444 switch (A[0])
2445 {
2446 default:
2447 R[NB] = LinearMultiply(R, B, A[0], NB);
2448 R[NB+1] = 0;
2449 return;
2450 case 0:
2451 SetWords(R, 0, NB+2);
2452 return;
2453 case 1:
2454 CopyWords(R, B, NB);
2455 R[NB] = R[NB+1] = 0;
2456 return;
2457 }
2458 }
2459
2460 size_t i;
2461 if ((NB/NA)%2 == 0)
2462 {
2463 Multiply(R, T, A, B, NA);
2464 CopyWords(T+2*NA, R+NA, NA);
2465
2466 for (i=2*NA; i<NB; i+=2*NA)
2467 Multiply(T+NA+i, T, A, B+i, NA);
2468 for (i=NA; i<NB; i+=2*NA)
2469 Multiply(R+i, T, A, B+i, NA);
2470 }
2471 else
2472 {
2473 for (i=0; i<NB; i+=2*NA)
2474 Multiply(R+i, T, A, B+i, NA);
2475 for (i=NA; i<NB; i+=2*NA)
2476 Multiply(T+NA+i, T, A, B+i, NA);
2477 }
2478
2479 if (Add(R+NA, R+NA, T+2*NA, NB-NA))
2480 Increment(R+NB, NA);
2481}
2482
2483// R[N] ----- result = A inverse mod 2**(WORD_BITS*N)
2484// T[3*N/2] - temporary work space
2485// A[N] ----- an odd number as input
2486
2487void RecursiveInverseModPower2(word *R, word *T, const word *A, size_t N)
2488{
2489 // Profiling tells us the original Else was dominant, so it was
2490 // promoted to the first If statement. The code change occurred
2491 // at Commit dc99266599a0e72d.
2492 if (N!=2)
2493 {
2494 const size_t N2 = N/2;
2495 RecursiveInverseModPower2(R0, T0, A0, N2);
2496 T0[0] = 1;
2497 SetWords(T0+1, 0, N2-1);
2498 MultiplyTop(R1, T1, T0, R0, A0, N2);
2499 MultiplyBottom(T0, T1, R0, A1, N2);
2500 Add(T0, R1, T0, N2);
2501 TwosComplement(T0, N2);
2502 MultiplyBottom(R1, T1, R0, T0, N2);
2503 }
2504 else
2505 {
2506 T[0] = AtomicInverseModPower2(A[0]);
2507 T[1] = 0;
2508 s_pBot[0](T+2, T, A);
2509 TwosComplement(T+2, 2);
2510 Increment(T+2, 2, 2);
2511 s_pBot[0](R, T, T+2);
2512 }
2513}
2514
2515// R[N] --- result = X/(2**(WORD_BITS*N)) mod M
2516// T[3*N] - temporary work space
2517// X[2*N] - number to be reduced
2518// M[N] --- modulus
2519// U[N] --- multiplicative inverse of M mod 2**(WORD_BITS*N)
2520
2521void MontgomeryReduce(word *R, word *T, word *X, const word *M, const word *U, size_t N)
2522{
2523#if 1
2524 MultiplyBottom(R, T, X, U, N);
2525 MultiplyTop(T, T+N, X, R, M, N);
2526 word borrow = Subtract(T, X+N, T, N);
2527 // defend against timing attack by doing this Add even when not needed
2528 word carry = Add(T+N, T, M, N);
2529 CRYPTOPP_ASSERT(carry | !borrow);
2530 CRYPTOPP_UNUSED(carry), CRYPTOPP_UNUSED(borrow);
2531 CopyWords(R, T + ((0-borrow) & N), N);
2532#elif 0
2533 const word u = 0-U[0];
2534 Declare2Words(p)
2535 for (size_t i=0; i<N; i++)
2536 {
2537 const word t = u * X[i];
2538 word c = 0;
2539 for (size_t j=0; j<N; j+=2)
2540 {
2541 MultiplyWords(p, t, M[j]);
2542 Acc2WordsBy1(p, X[i+j]);
2543 Acc2WordsBy1(p, c);
2544 X[i+j] = LowWord(p);
2545 c = HighWord(p);
2546 MultiplyWords(p, t, M[j+1]);
2547 Acc2WordsBy1(p, X[i+j+1]);
2548 Acc2WordsBy1(p, c);
2549 X[i+j+1] = LowWord(p);
2550 c = HighWord(p);
2551 }
2552
2553 if (Increment(X+N+i, N-i, c))
2554 while (!Subtract(X+N, X+N, M, N)) {}
2555 }
2556
2557 memcpy(R, X+N, N*WORD_SIZE);
2558#else
2559 __m64 u = _mm_cvtsi32_si64(0-U[0]), p;
2560 for (size_t i=0; i<N; i++)
2561 {
2562 __m64 t = _mm_cvtsi32_si64(X[i]);
2563 t = _mm_mul_su32(t, u);
2564 __m64 c = _mm_setzero_si64();
2565 for (size_t j=0; j<N; j+=2)
2566 {
2567 p = _mm_mul_su32(t, _mm_cvtsi32_si64(M[j]));
2568 p = _mm_add_si64(p, _mm_cvtsi32_si64(X[i+j]));
2569 c = _mm_add_si64(c, p);
2570 X[i+j] = _mm_cvtsi64_si32(c);
2571 c = _mm_srli_si64(c, 32);
2572 p = _mm_mul_su32(t, _mm_cvtsi32_si64(M[j+1]));
2573 p = _mm_add_si64(p, _mm_cvtsi32_si64(X[i+j+1]));
2574 c = _mm_add_si64(c, p);
2575 X[i+j+1] = _mm_cvtsi64_si32(c);
2576 c = _mm_srli_si64(c, 32);
2577 }
2578
2579 if (Increment(X+N+i, N-i, _mm_cvtsi64_si32(c)))
2580 while (!Subtract(X+N, X+N, M, N)) {}
2581 }
2582
2583 memcpy(R, X+N, N*WORD_SIZE);
2584 _mm_empty();
2585#endif
2586}
2587
2588// R[N] --- result = X/(2**(WORD_BITS*N/2)) mod M
2589// T[2*N] - temporary work space
2590// X[2*N] - number to be reduced
2591// M[N] --- modulus
2592// U[N/2] - multiplicative inverse of M mod 2**(WORD_BITS*N/2)
2593// V[N] --- 2**(WORD_BITS*3*N/2) mod M
2594
2595void HalfMontgomeryReduce(word *R, word *T, const word *X, const word *M, const word *U, const word *V, size_t N)
2596{
2597 CRYPTOPP_ASSERT(N%2==0 && N>=4);
2598
2599#define M0 M
2600#define M1 (M+N2)
2601#define V0 V
2602#define V1 (V+N2)
2603
2604#define X0 X
2605#define X1 (X+N2)
2606#define X2 (X+N)
2607#define X3 (X+N+N2)
2608
2609 const size_t N2 = N/2;
2610 Multiply(T0, T2, V0, X3, N2);
2611 int c2 = Add(T0, T0, X0, N);
2612 MultiplyBottom(T3, T2, T0, U, N2);
2613 MultiplyTop(T2, R, T0, T3, M0, N2);
2614 c2 -= Subtract(T2, T1, T2, N2);
2615 Multiply(T0, R, T3, M1, N2);
2616 c2 -= Subtract(T0, T2, T0, N2);
2617 int c3 = -(int)Subtract(T1, X2, T1, N2);
2618 Multiply(R0, T2, V1, X3, N2);
2619 c3 += Add(R, R, T, N);
2620
2621 if (c2>0)
2622 c3 += Increment(R1, N2);
2623 else if (c2<0)
2624 c3 -= Decrement(R1, N2, -c2);
2625
2626 CRYPTOPP_ASSERT(c3>=-1 && c3<=1);
2627 if (c3>0)
2628 Subtract(R, R, M, N);
2629 else if (c3<0)
2630 Add(R, R, M, N);
2631
2632#undef M0
2633#undef M1
2634#undef V0
2635#undef V1
2636
2637#undef X0
2638#undef X1
2639#undef X2
2640#undef X3
2641}
2642
2643#undef A0
2644#undef A1
2645#undef B0
2646#undef B1
2647
2648#undef T0
2649#undef T1
2650#undef T2
2651#undef T3
2652
2653#undef R0
2654#undef R1
2655#undef R2
2656#undef R3
2657
2658/*
2659// do a 3 word by 2 word divide, returns quotient and leaves remainder in A
2660static word SubatomicDivide(word *A, word B0, word B1)
2661{
2662 // Assert {A[2],A[1]} < {B1,B0}, so quotient can fit in a word
2663 CRYPTOPP_ASSERT(A[2] < B1 || (A[2]==B1 && A[1] < B0));
2664
2665 // estimate the quotient: do a 2 word by 1 word divide
2666 word Q;
2667 if (B1+1 == 0)
2668 Q = A[2];
2669 else
2670 Q = DWord(A[1], A[2]).DividedBy(B1+1);
2671
2672 // now subtract Q*B from A
2673 DWord p = DWord::Multiply(B0, Q);
2674 DWord u = (DWord) A[0] - p.GetLowHalf();
2675 A[0] = u.GetLowHalf();
2676 u = (DWord) A[1] - p.GetHighHalf() - u.GetHighHalfAsBorrow() - DWord::Multiply(B1, Q);
2677 A[1] = u.GetLowHalf();
2678 A[2] += u.GetHighHalf();
2679
2680 // Q <= actual quotient, so fix it
2681 while (A[2] || A[1] > B1 || (A[1]==B1 && A[0]>=B0))
2682 {
2683 u = (DWord) A[0] - B0;
2684 A[0] = u.GetLowHalf();
2685 u = (DWord) A[1] - B1 - u.GetHighHalfAsBorrow();
2686 A[1] = u.GetLowHalf();
2687 A[2] += u.GetHighHalf();
2688 Q++;
2689 CRYPTOPP_ASSERT(Q); // shouldn't overflow
2690 }
2691
2692 return Q;
2693}
2694
2695// do a 4 word by 2 word divide, returns 2 word quotient in Q0 and Q1
2696static inline void AtomicDivide(word *Q, const word *A, const word *B)
2697{
2698 if (!B[0] && !B[1]) // if divisor is 0, we assume divisor==2**(2*WORD_BITS)
2699 {
2700 Q[0] = A[2];
2701 Q[1] = A[3];
2702 }
2703 else
2704 {
2705 word T[4];
2706 T[0] = A[0]; T[1] = A[1]; T[2] = A[2]; T[3] = A[3];
2707 Q[1] = SubatomicDivide(T+1, B[0], B[1]);
2708 Q[0] = SubatomicDivide(T, B[0], B[1]);
2709
2710#if defined(CRYPTOPP_DEBUG)
2711 // multiply quotient and divisor and add remainder, make sure it equals dividend
2712 CRYPTOPP_ASSERT(!T[2] && !T[3] && (T[1] < B[1] || (T[1]==B[1] && T[0]<B[0])));
2713 word P[4];
2714 LowLevel::Multiply2(P, Q, B);
2715 Add(P, P, T, 4);
2716 CRYPTOPP_ASSERT(memcmp(P, A, 4*WORD_SIZE)==0);
2717#endif
2718 }
2719}
2720*/
2721
2722static inline void AtomicDivide(word *Q, const word *A, const word *B)
2723{
2724 word T[4];
2725 DWord q = DivideFourWordsByTwo<word, DWord>(T, DWord(A[0], A[1]), DWord(A[2], A[3]), DWord(B[0], B[1]));
2726 Q[0] = q.GetLowHalf();
2727 Q[1] = q.GetHighHalf();
2728
2729#if defined(CRYPTOPP_DEBUG)
2730 if (B[0] || B[1])
2731 {
2732 // multiply quotient and divisor and add remainder, make sure it equals dividend
2733 CRYPTOPP_ASSERT(!T[2] && !T[3] && (T[1] < B[1] || (T[1]==B[1] && T[0]<B[0])));
2734 word P[4];
2735 s_pMul[0](P, Q, B);
2736 Add(P, P, T, 4);
2737 CRYPTOPP_ASSERT(memcmp(P, A, 4*WORD_SIZE)==0);
2738 }
2739#endif
2740}
2741
2742// for use by Divide(), corrects the underestimated quotient {Q1,Q0}
2743static void CorrectQuotientEstimate(word *R, word *T, word *Q, const word *B, size_t N)
2744{
2745 CRYPTOPP_ASSERT(N && N%2==0);
2746
2747 AsymmetricMultiply(T, T+N+2, Q, 2, B, N);
2748
2749 word borrow = Subtract(R, R, T, N+2);
2750 CRYPTOPP_ASSERT(!borrow && !R[N+1]);
2751 CRYPTOPP_UNUSED(borrow);
2752
2753 while (R[N] || Compare(R, B, N) >= 0)
2754 {
2755 R[N] -= Subtract(R, R, B, N);
2756 Q[1] += (++Q[0]==0);
2757 CRYPTOPP_ASSERT(Q[0] || Q[1]); // no overflow
2758 }
2759}
2760
2761// R[NB] -------- remainder = A%B
2762// Q[NA-NB+2] --- quotient = A/B
2763// T[NA+3*(NB+2)] - temp work space
2764// A[NA] -------- dividend
2765// B[NB] -------- divisor
2766
2767void Divide(word *R, word *Q, word *T, const word *A, size_t NA, const word *B, size_t NB)
2768{
2769 CRYPTOPP_ASSERT(NA && NB && NA%2==0 && NB%2==0);
2770 CRYPTOPP_ASSERT(B[NB-1] || B[NB-2]);
2771 CRYPTOPP_ASSERT(NB <= NA);
2772
2773 // set up temporary work space
2774 word *const TA=T;
2775 word *const TB=T+NA+2;
2776 word *const TP=T+NA+2+NB;
2777
2778 // copy B into TB and normalize it so that TB has highest bit set to 1
2779 unsigned shiftWords = (B[NB-1]==0);
2780 TB[0] = TB[NB-1] = 0;
2781 CopyWords(TB+shiftWords, B, NB-shiftWords);
2782 unsigned shiftBits = WORD_BITS - BitPrecision(TB[NB-1]);
2783 CRYPTOPP_ASSERT(shiftBits < WORD_BITS);
2784 ShiftWordsLeftByBits(TB, NB, shiftBits);
2785
2786 // copy A into TA and normalize it
2787 TA[0] = TA[NA] = TA[NA+1] = 0;
2788 CopyWords(TA+shiftWords, A, NA);
2789 ShiftWordsLeftByBits(TA, NA+2, shiftBits);
2790
2791 if (TA[NA+1]==0 && TA[NA] <= 1)
2792 {
2793 Q[NA-NB+1] = Q[NA-NB] = 0;
2794 while (TA[NA] || Compare(TA+NA-NB, TB, NB) >= 0)
2795 {
2796 TA[NA] -= Subtract(TA+NA-NB, TA+NA-NB, TB, NB);
2797 ++Q[NA-NB];
2798 }
2799 }
2800 else
2801 {
2802 NA+=2;
2803 CRYPTOPP_ASSERT(Compare(TA+NA-NB, TB, NB) < 0);
2804 }
2805
2806 word BT[2];
2807 BT[0] = TB[NB-2] + 1;
2808 BT[1] = TB[NB-1] + (BT[0]==0);
2809
2810 // start reducing TA mod TB, 2 words at a time
2811 for (size_t i=NA-2; i>=NB; i-=2)
2812 {
2813 AtomicDivide(Q+i-NB, TA+i-2, BT);
2814 CorrectQuotientEstimate(TA+i-NB, TP, Q+i-NB, TB, NB);
2815 }
2816
2817 // copy TA into R, and denormalize it
2818 CopyWords(R, TA+shiftWords, NB);
2819 ShiftWordsRightByBits(R, NB, shiftBits);
2820}
2821
2822static inline size_t EvenWordCount(const word *X, size_t N)
2823{
2824 while (N && X[N-2]==0 && X[N-1]==0)
2825 N-=2;
2826 return N;
2827}
2828
2829// return k
2830// R[N] --- result = A^(-1) * 2^k mod M
2831// T[4*N] - temporary work space
2832// A[NA] -- number to take inverse of
2833// M[N] --- modulus
2834
2835unsigned int AlmostInverse(word *R, word *T, const word *A, size_t NA, const word *M, size_t N)
2836{
2837 CRYPTOPP_ASSERT(NA<=N && N && N%2==0);
2838
2839 word *b = T;
2840 word *c = T+N;
2841 word *f = T+2*N;
2842 word *g = T+3*N;
2843 size_t bcLen=2, fgLen=EvenWordCount(M, N);
2844 unsigned int k=0;
2845 bool s=false;
2846
2847 SetWords(T, 0, 3*N);
2848 b[0]=1;
2849 CopyWords(f, A, NA);
2850 CopyWords(g, M, N);
2851
2852 while (1)
2853 {
2854 word t=f[0];
2855 while (!t)
2856 {
2857 if (EvenWordCount(f, fgLen)==0)
2858 {
2859 SetWords(R, 0, N);
2860 return 0;
2861 }
2862
2863 ShiftWordsRightByWords(f, fgLen, 1);
2864 bcLen += 2 * (c[bcLen-1] != 0);
2865 CRYPTOPP_ASSERT(bcLen <= N);
2866 ShiftWordsLeftByWords(c, bcLen, 1);
2867 k+=WORD_BITS;
2868 t=f[0];
2869 }
2870
2871 // t must be non-0; otherwise undefined.
2872 unsigned int i = TrailingZeros(t);
2873 t >>= i;
2874 k += i;
2875
2876 if (t==1 && f[1]==0 && EvenWordCount(f+2, fgLen-2)==0)
2877 {
2878 if (s)
2879 Subtract(R, M, b, N);
2880 else
2881 CopyWords(R, b, N);
2882 return k;
2883 }
2884
2885 ShiftWordsRightByBits(f, fgLen, i);
2886 t = ShiftWordsLeftByBits(c, bcLen, i);
2887 c[bcLen] += t;
2888 bcLen += 2 * (t!=0);
2889 CRYPTOPP_ASSERT(bcLen <= N);
2890
2891 bool swap = Compare(f, g, fgLen)==-1;
2892 ConditionalSwapPointers(swap, f, g);
2893 ConditionalSwapPointers(swap, b, c);
2894 s ^= swap;
2895
2896 fgLen -= 2 * !(f[fgLen-2] | f[fgLen-1]);
2897
2898 Subtract(f, f, g, fgLen);
2899 t = Add(b, b, c, bcLen);
2900 b[bcLen] += t;
2901 bcLen += 2*t;
2902 CRYPTOPP_ASSERT(bcLen <= N);
2903 }
2904}
2905
2906// R[N] - result = A/(2^k) mod M
2907// A[N] - input
2908// M[N] - modulus
2909
2910void DivideByPower2Mod(word *R, const word *A, size_t k, const word *M, size_t N)
2911{
2912 CopyWords(R, A, N);
2913
2914 while (k--)
2915 {
2916 if (R[0]%2==0)
2917 ShiftWordsRightByBits(R, N, 1);
2918 else
2919 {
2920 word carry = Add(R, R, M, N);
2921 ShiftWordsRightByBits(R, N, 1);
2922 R[N-1] += carry<<(WORD_BITS-1);
2923 }
2924 }
2925}
2926
2927// R[N] - result = A*(2^k) mod M
2928// A[N] - input
2929// M[N] - modulus
2930
2931void MultiplyByPower2Mod(word *R, const word *A, size_t k, const word *M, size_t N)
2932{
2933 CopyWords(R, A, N);
2934
2935 while (k--)
2936 if (ShiftWordsLeftByBits(R, N, 1) || Compare(R, M, N)>=0)
2937 Subtract(R, R, M, N);
2938}
2939
2940// ******************************************************************
2941
2942static const unsigned int RoundupSizeTable[] = {2, 2, 2, 4, 4, 8, 8, 8, 8};
2943
2944static inline size_t RoundupSize(size_t n)
2945{
2946 if (n<=8)
2947 return RoundupSizeTable[n];
2948 else if (n<=16)
2949 return 16;
2950 else if (n<=32)
2951 return 32;
2952 else if (n<=64)
2953 return 64;
2954 else
2955 return size_t(1) << BitPrecision(n-1);
2956}
2957
2959 : reg(2), sign(POSITIVE)
2960{
2961 reg[0] = reg[1] = 0;
2962}
2963
2965 : reg(RoundupSize(t.WordCount())), sign(t.sign)
2966{
2967 CopyWords(reg, t.reg, reg.size());
2968}
2969
2970Integer::Integer(Sign s, lword value)
2971 : reg(2), sign(s)
2972{
2973 reg[0] = word(value);
2974 reg[1] = word(SafeRightShift<WORD_BITS>(value));
2975}
2976
2977Integer::Integer(signed long value)
2978 : reg(2)
2979{
2980 if (value >= 0)
2981 sign = POSITIVE;
2982 else
2983 {
2984 sign = NEGATIVE;
2985 value = -value;
2986 }
2987 reg[0] = word(value);
2988 reg[1] = word(SafeRightShift<WORD_BITS>((unsigned long)value));
2989}
2990
2991Integer::Integer(Sign s, word high, word low)
2992 : reg(2), sign(s)
2993{
2994 reg[0] = low;
2995 reg[1] = high;
2996}
2997
2999{
3000 if (ByteCount() > sizeof(long))
3001 return false;
3002
3003 unsigned long value = (unsigned long)reg[0];
3004 value += SafeLeftShift<WORD_BITS, unsigned long>((unsigned long)reg[1]);
3005
3006 if (sign==POSITIVE)
3007 return (signed long)value >= 0;
3008 else
3009 return -(signed long)value < 0;
3010}
3011
3012signed long Integer::ConvertToLong() const
3013{
3015
3016 unsigned long value = (unsigned long)reg[0];
3017 value += SafeLeftShift<WORD_BITS, unsigned long>((unsigned long)reg[1]);
3018 return sign==POSITIVE ? value : -(signed long)value;
3019}
3020
3021Integer::Integer(BufferedTransformation &encodedInteger, size_t byteCount, Signedness s, ByteOrder o)
3022{
3024
3025 if (o != LITTLE_ENDIAN_ORDER)
3026 {
3027 Decode(encodedInteger, byteCount, s);
3028 }
3029 else
3030 {
3031 SecByteBlock block(byteCount);
3032 encodedInteger.Get(block, block.size());
3033 std::reverse(block.begin(), block.begin()+block.size());
3034
3035 Decode(block.begin(), block.size(), s);
3036 }
3037}
3038
3039Integer::Integer(const byte *encodedInteger, size_t byteCount, Signedness s, ByteOrder o)
3040{
3041 CRYPTOPP_ASSERT(encodedInteger && byteCount); // NULL buffer
3043
3044 if (o != LITTLE_ENDIAN_ORDER)
3045 {
3046 Decode(encodedInteger, byteCount, s);
3047 }
3048 else
3049 {
3050 SecByteBlock block(byteCount);
3051#if (_MSC_VER >= 1500)
3052 std::reverse_copy(encodedInteger, encodedInteger+byteCount,
3053 stdext::make_checked_array_iterator(block.begin(), block.size()));
3054#else
3055 std::reverse_copy(encodedInteger, encodedInteger+byteCount, block.begin());
3056#endif
3057 Decode(block.begin(), block.size(), s);
3058 return;
3059 }
3060}
3061
3063{
3064 // Make explicit call to avoid virtual-dispatch findings in ctor
3066}
3067
3069{
3070 Randomize(rng, bitcount);
3071}
3072
3073Integer::Integer(RandomNumberGenerator &rng, const Integer &min, const Integer &max, RandomNumberType rnType, const Integer &equiv, const Integer &mod)
3074{
3075 if (!Randomize(rng, min, max, rnType, equiv, mod))
3077}
3078
3080{
3081 Integer r((word)0, BitsToWords(e+1));
3082 r.SetBit(e);
3083 return r;
3084}
3085
3087{
3088 return IsNegative() ? false : (reg[0]==0 && WordCount()==0);
3089}
3090
3092{
3093 if (this != &t)
3094 {
3095 if (reg.size() != t.reg.size() || t.reg[t.reg.size()/2] == 0)
3096 reg.New(RoundupSize(t.WordCount()));
3097 CopyWords(reg, t.reg, reg.size());
3098 sign = t.sign;
3099 }
3100 return *this;
3101}
3102
3103bool Integer::GetBit(size_t n) const
3104{
3105 // Profiling tells us the original Else was dominant, so it was
3106 // promoted to the first If statement. The code change occurred
3107 // at Commit dc99266599a0e72d.
3108 if (n/WORD_BITS < reg.size())
3109 return bool((reg[n/WORD_BITS] >> (n % WORD_BITS)) & 1);
3110 else
3111 return 0;
3112}
3113
3114void Integer::SetBit(size_t n, bool value)
3115{
3116 if (value)
3117 {
3118 reg.CleanGrow(RoundupSize(BitsToWords(n+1)));
3119 reg[n/WORD_BITS] |= (word(1) << (n%WORD_BITS));
3120 }
3121 else
3122 {
3123 if (n/WORD_BITS < reg.size())
3124 reg[n/WORD_BITS] &= ~(word(1) << (n%WORD_BITS));
3125 }
3126}
3127
3128byte Integer::GetByte(size_t n) const
3129{
3130 // Profiling tells us the original Else was dominant, so it was
3131 // promoted to the first If statement. The code change occurred
3132 // at Commit dc99266599a0e72d.
3133 if (n/WORD_SIZE < reg.size())
3134 return byte(reg[n/WORD_SIZE] >> ((n%WORD_SIZE)*8));
3135 else
3136 return 0;
3137}
3138
3139void Integer::SetByte(size_t n, byte value)
3140{
3141 reg.CleanGrow(RoundupSize(BytesToWords(n+1)));
3142 reg[n/WORD_SIZE] &= ~(word(0xff) << 8*(n%WORD_SIZE));
3143 reg[n/WORD_SIZE] |= (word(value) << 8*(n%WORD_SIZE));
3144}
3145
3146lword Integer::GetBits(size_t i, size_t n) const
3147{
3148 lword v = 0;
3149 CRYPTOPP_ASSERT(n <= sizeof(v)*8);
3150 for (unsigned int j=0; j<n; j++)
3151 v |= lword(GetBit(i+j)) << j;
3152 return v;
3153}
3154
3156{
3157 Integer result(*this);
3158 result.Negate();
3159 return result;
3160}
3161
3163{
3164 Integer result(*this);
3165 result.sign = POSITIVE;
3166 return result;
3167}
3168
3170{
3171 reg.swap(a.reg);
3172 std::swap(sign, a.sign);
3173}
3174
3175Integer::Integer(word value, size_t length)
3176 : reg(RoundupSize(length)), sign(POSITIVE)
3177{
3178 reg[0] = value;
3179 SetWords(reg+1, 0, reg.size()-1);
3180}
3181
3182template <class T>
3183static Integer StringToInteger(const T *str, ByteOrder order)
3184{
3186
3187 int radix, sign = 1;
3188 // GCC workaround
3189 // std::char_traits<wchar_t>::length() not defined in GCC 3.2 and STLport 4.5.3
3190 unsigned int length;
3191 for (length = 0; str[length] != 0; length++) {}
3192
3193 Integer v;
3194
3195 if (length == 0)
3196 return Integer::Zero();
3197
3198 switch (str[length-1])
3199 {
3200 case 'h':
3201 case 'H':
3202 radix=16;
3203 break;
3204 case 'o':
3205 case 'O':
3206 radix=8;
3207 break;
3208 case 'b':
3209 case 'B':
3210 radix=2;
3211 break;
3212 default:
3213 radix=10;
3214 }
3215
3216 // 'str' is of length 1 or more
3217 if (str[0] == '-')
3218 {
3219 sign = -1;
3220 str += 1, length -= 1;
3221 }
3222
3223 if (length > 2 && str[0] == '0' && (str[1] == 'x' || str[1] == 'X'))
3224 {
3225 radix = 16;
3226 str += 2, length -= 2;
3227 }
3228
3229 if (order == BIG_ENDIAN_ORDER)
3230 {
3231 for (unsigned int i=0; i<length; i++)
3232 {
3233 int digit, ch = static_cast<int>(str[i]);
3234
3235 // Profiling showd the second and third Else needed to be swapped
3236 // The code change occurred at Commit dc99266599a0e72d.
3237 if (ch >= '0' && ch <= '9')
3238 digit = ch - '0';
3239 else if (ch >= 'a' && ch <= 'f')
3240 digit = ch - 'a' + 10;
3241 else if (ch >= 'A' && ch <= 'F')
3242 digit = ch - 'A' + 10;
3243 else
3244 digit = radix;
3245
3246 if (digit < radix)
3247 {
3248 v *= radix;
3249 v += digit;
3250 }
3251 }
3252 }
3253 else if (radix == 16 && order == LITTLE_ENDIAN_ORDER)
3254 {
3255 // Nibble high, low and count
3256 unsigned int nh = 0, nl = 0, nc = 0;
3257 Integer position(Integer::One());
3258
3259 for (unsigned int i=0; i<length; i++)
3260 {
3261 int digit, ch = static_cast<int>(str[i]);
3262
3263 if (ch >= '0' && ch <= '9')
3264 digit = ch - '0';
3265 else if (ch >= 'a' && ch <= 'f')
3266 digit = ch - 'a' + 10;
3267 else if (ch >= 'A' && ch <= 'F')
3268 digit = ch - 'A' + 10;
3269 else
3270 digit = radix;
3271
3272 if (digit < radix)
3273 {
3274 if (nc++ == 0)
3275 nh = digit;
3276 else
3277 nl = digit;
3278
3279 if (nc == 2)
3280 {
3281 v += position * (nh << 4 | nl);
3282 nc = 0, position <<= 8;
3283 }
3284 }
3285 }
3286
3287 if (nc == 1)
3288 v += nh * position;
3289 }
3290 else // LITTLE_ENDIAN_ORDER && radix != 16
3291 {
3292 for (int i=static_cast<int>(length)-1; i>=0; i--)
3293 {
3294 int digit, ch = static_cast<int>(str[i]);
3295
3296 if (ch >= '0' && ch <= '9')
3297 digit = ch - '0';
3298 else if (ch >= 'a' && ch <= 'f')
3299 digit = ch - 'a' + 10;
3300 else if (ch >= 'A' && ch <= 'F')
3301 digit = ch - 'A' + 10;
3302 else
3303 digit = radix;
3304
3305 if (digit < radix)
3306 {
3307 v *= radix;
3308 v += digit;
3309 }
3310 }
3311 }
3312
3313 if (sign == -1)
3314 v.Negate();
3315
3316 return v;
3317}
3318
3319Integer::Integer(const char *str, ByteOrder order)
3320 : reg(2), sign(POSITIVE)
3321{
3322 *this = StringToInteger(str,order);
3323}
3324
3325Integer::Integer(const wchar_t *str, ByteOrder order)
3326 : reg(2), sign(POSITIVE)
3327{
3328 *this = StringToInteger(str,order);
3329}
3330
3331unsigned int Integer::WordCount() const
3332{
3333 return (unsigned int)CountWords(reg, reg.size());
3334}
3335
3336unsigned int Integer::ByteCount() const
3337{
3338 unsigned wordCount = WordCount();
3339 if (wordCount)
3340 return (wordCount-1)*WORD_SIZE + BytePrecision(reg[wordCount-1]);
3341 else
3342 return 0;
3343}
3344
3345unsigned int Integer::BitCount() const
3346{
3347 unsigned wordCount = WordCount();
3348 if (wordCount)
3349 return (wordCount-1)*WORD_BITS + BitPrecision(reg[wordCount-1]);
3350 else
3351 return 0;
3352}
3353
3354void Integer::Decode(const byte *input, size_t inputLen, Signedness s)
3355{
3356 CRYPTOPP_ASSERT(input && inputLen); // NULL buffer
3357 StringStore store(input, inputLen);
3358 Decode(store, inputLen, s);
3359}
3360
3362{
3363 CRYPTOPP_ASSERT(bt.MaxRetrievable() >= inputLen);
3364 if (bt.MaxRetrievable() < inputLen)
3365 throw InvalidArgument("Integer: input length is too small");
3366
3367 byte b;
3368 bt.Peek(b);
3369 sign = ((s==SIGNED) && (b & 0x80)) ? NEGATIVE : POSITIVE;
3370
3371 while (inputLen>0 && (sign==POSITIVE ? b==0 : b==0xff))
3372 {
3373 bt.Skip(1);
3374 inputLen--;
3375 bt.Peek(b);
3376 }
3377
3378 reg.CleanNew(RoundupSize(BytesToWords(inputLen)));
3379 for (size_t i=inputLen; i > 0; i--)
3380 {
3381 (void)bt.Get(b);
3382 reg[(i-1)/WORD_SIZE] |= word(b) << ((i-1)%WORD_SIZE)*8;
3383 }
3384
3385 if (sign == NEGATIVE)
3386 {
3387 for (size_t i=inputLen; i<reg.size()*WORD_SIZE; i++)
3388 reg[i/WORD_SIZE] |= word(0xff) << (i%WORD_SIZE)*8;
3389 TwosComplement(reg, reg.size());
3390 }
3391}
3392
3393size_t Integer::MinEncodedSize(Signedness signedness) const
3394{
3395 // Profiling tells us the original second If was dominant, so it
3396 // was promoted to the first If statement. The code change occurred
3397 // at Commit dc99266599a0e72d.
3398 unsigned int outputLen = STDMAX(1U, ByteCount());
3399 const bool pre = (signedness == UNSIGNED);
3400 if (!pre && NotNegative() && (GetByte(outputLen-1) & 0x80))
3401 outputLen++;
3402 if (pre)
3403 return outputLen;
3404 if (IsNegative() && *this < -Power2(outputLen*8-1))
3405 outputLen++;
3406 return outputLen;
3407}
3408
3409// PKCS12_PBKDF and other classes use undersized buffers
3410void Integer::Encode(byte *output, size_t outputLen, Signedness signedness) const
3411{
3412 CRYPTOPP_ASSERT(output && outputLen); // NULL buffer
3413 ArraySink sink(output, outputLen);
3414 Encode(sink, outputLen, signedness);
3415}
3416
3417void Integer::Encode(BufferedTransformation &bt, size_t outputLen, Signedness signedness) const
3418{
3419 if (signedness == UNSIGNED || NotNegative())
3420 {
3421 for (size_t i=outputLen; i > 0; i--)
3422 bt.Put(GetByte(i-1));
3423 }
3424 else
3425 {
3426 // take two's complement of *this
3427 Integer temp = Integer::Power2(8*STDMAX((size_t)ByteCount(), outputLen)) + *this;
3428 temp.Encode(bt, outputLen, UNSIGNED);
3429 }
3430}
3431
3433{
3434 DERGeneralEncoder enc(bt, INTEGER);
3436 enc.MessageEnd();
3437}
3438
3439void Integer::BERDecode(const byte *input, size_t len)
3440{
3441 CRYPTOPP_ASSERT(input && len); // NULL buffer
3442 StringStore store(input, len);
3443 BERDecode(store);
3444}
3445
3447{
3448 BERGeneralDecoder dec(bt, INTEGER);
3449 if (!dec.IsDefiniteLength() || dec.MaxRetrievable() < dec.RemainingLength())
3451 Decode(dec, (size_t)dec.RemainingLength(), SIGNED);
3452 dec.MessageEnd();
3453}
3454
3456{
3457 DERGeneralEncoder enc(bt, OCTET_STRING);
3458 Encode(enc, length);
3459 enc.MessageEnd();
3460}
3461
3463{
3464 BERGeneralDecoder dec(bt, OCTET_STRING);
3465 if (!dec.IsDefiniteLength() || dec.RemainingLength() != length)
3467 Decode(dec, length);
3468 dec.MessageEnd();
3469}
3470
3471size_t Integer::OpenPGPEncode(byte *output, size_t bufferSize) const
3472{
3473 CRYPTOPP_ASSERT(output && bufferSize); // NULL buffer
3474 CRYPTOPP_ASSERT(bufferSize >= MinEncodedSize()); // Undersized buffer
3475 ArraySink sink(output, bufferSize);
3476 return OpenPGPEncode(sink);
3477}
3478
3480{
3481 word16 bitCount = word16(BitCount());
3482 bt.PutWord16(bitCount);
3483 size_t byteCount = BitsToBytes(bitCount);
3484 Encode(bt, byteCount);
3485 return 2 + byteCount;
3486}
3487
3488void Integer::OpenPGPDecode(const byte *input, size_t len)
3489{
3490 CRYPTOPP_ASSERT(input && len); // NULL buffer
3491 StringStore store(input, len);
3492 OpenPGPDecode(store);
3493}
3494
3496{
3497 word16 bitCount;
3498 if (bt.GetWord16(bitCount) != 2 || bt.MaxRetrievable() < BitsToBytes(bitCount))
3499 throw OpenPGPDecodeErr();
3500 Decode(bt, BitsToBytes(bitCount));
3501}
3502
3504{
3505 const size_t nbytes = nbits/8 + 1;
3506 SecByteBlock buf(nbytes);
3507 rng.GenerateBlock(buf, nbytes);
3508 if (nbytes)
3509 buf[0] = (byte)Crop(buf[0], nbits % 8);
3510 Decode(buf, nbytes, UNSIGNED);
3511}
3512
3514{
3515 if (min > max)
3516 throw InvalidArgument("Integer: Min must be no greater than Max");
3517
3518 Integer range = max - min;
3519 const unsigned int nbits = range.BitCount();
3520
3521 do
3522 {
3523 Randomize(rng, nbits);
3524 }
3525 while (*this > range);
3526
3527 *this += min;
3528}
3529
3530bool Integer::Randomize(RandomNumberGenerator &rng, const Integer &min, const Integer &max, RandomNumberType rnType, const Integer &equiv, const Integer &mod)
3531{
3532 return GenerateRandomNoThrow(rng, MakeParameters("Min", min)("Max", max)
3533 ("RandomNumberType", rnType)("EquivalentTo", equiv)("Mod", mod));
3534}
3535
3537{
3538public:
3539 KDF2_RNG(const byte *seed, size_t seedSize)
3540 : m_counter(0), m_counterAndSeed(ClampSize(seedSize) + 4)
3541 {
3542 memcpy(m_counterAndSeed + 4, seed, ClampSize(seedSize));
3543 }
3544
3545 void GenerateBlock(byte *output, size_t size)
3546 {
3547 CRYPTOPP_ASSERT(output && size); // NULL buffer
3548 PutWord(false, BIG_ENDIAN_ORDER, m_counterAndSeed, m_counter);
3549 ++m_counter;
3550 P1363_KDF2<SHA1>::DeriveKey(output, size, m_counterAndSeed, m_counterAndSeed.size(), NULLPTR, 0);
3551 }
3552
3553 // UBsan finding, -Wstringop-overflow
3554 inline size_t ClampSize(size_t req) const
3555 {
3556 // Clamp at 16 MB
3557 if (req > 16U*1024*1024)
3558 return 16U*1024*1024;
3559 return req;
3560 }
3561
3562private:
3563 word32 m_counter;
3564 SecByteBlock m_counterAndSeed;
3565};
3566
3568{
3569 Integer min = params.GetValueWithDefault("Min", Integer::Zero());
3570 Integer max;
3571 if (!params.GetValue("Max", max))
3572 {
3573 int bitLength;
3574 if (params.GetIntValue("BitLength", bitLength))
3575 max = Integer::Power2(bitLength);
3576 else
3577 throw InvalidArgument("Integer: missing Max argument");
3578 }
3579 if (min > max)
3580 throw InvalidArgument("Integer: Min must be no greater than Max");
3581
3582 Integer equiv = params.GetValueWithDefault("EquivalentTo", Integer::Zero());
3583 Integer mod = params.GetValueWithDefault("Mod", Integer::One());
3584
3585 if (equiv.IsNegative() || equiv >= mod)
3586 throw InvalidArgument("Integer: invalid EquivalentTo and/or Mod argument");
3587
3588 Integer::RandomNumberType rnType = params.GetValueWithDefault("RandomNumberType", Integer::ANY);
3589
3590 member_ptr<KDF2_RNG> kdf2Rng;
3592 if (params.GetValue(Name::Seed(), seed))
3593 {
3594 ByteQueue bq;
3595 DERSequenceEncoder seq(bq);
3596 min.DEREncode(seq);
3597 max.DEREncode(seq);
3598 equiv.DEREncode(seq);
3599 mod.DEREncode(seq);
3600 DEREncodeUnsigned(seq, rnType);
3601 DEREncodeOctetString(seq, seed.begin(), seed.size());
3602 seq.MessageEnd();
3603
3604 SecByteBlock finalSeed((size_t)bq.MaxRetrievable());
3605 bq.Get(finalSeed, finalSeed.size());
3606 kdf2Rng.reset(new KDF2_RNG(finalSeed.begin(), finalSeed.size()));
3607 }
3608 RandomNumberGenerator &rng = kdf2Rng.get() ? (RandomNumberGenerator &)*kdf2Rng : i_rng;
3609
3610 switch (rnType)
3611 {
3612 case ANY:
3613 if (mod == One())
3614 Randomize(rng, min, max);
3615 else
3616 {
3617 Integer min1 = min + (equiv-min)%mod;
3618 if (max < min1)
3619 return false;
3620 Randomize(rng, Zero(), (max - min1) / mod);
3621 *this *= mod;
3622 *this += min1;
3623 }
3624 return true;
3625
3626 case PRIME:
3627 {
3628 const PrimeSelector *pSelector = params.GetValueWithDefault(Name::PointerToPrimeSelector(), (const PrimeSelector *)NULLPTR);
3629
3630 int i;
3631 i = 0;
3632 while (1)
3633 {
3634 if (++i==16)
3635 {
3636 // check if there are any suitable primes in [min, max]
3637 Integer first = min;
3638 if (FirstPrime(first, max, equiv, mod, pSelector))
3639 {
3640 // if there is only one suitable prime, we're done
3641 *this = first;
3642 if (!FirstPrime(first, max, equiv, mod, pSelector))
3643 return true;
3644 }
3645 else
3646 return false;
3647 }
3648
3649 Randomize(rng, min, max);
3650 if (FirstPrime(*this, STDMIN(*this+mod*PrimeSearchInterval(max), max), equiv, mod, pSelector))
3651 return true;
3652 }
3653 }
3654
3655 default:
3656 throw InvalidArgument("Integer: invalid RandomNumberType argument");
3657 }
3658}
3659
3660std::istream& operator>>(std::istream& in, Integer &a)
3661{
3662 char c;
3663 unsigned int length = 0;
3664 SecBlock<char> str(length + 16);
3665
3666 std::ws(in);
3667
3668 do
3669 {
3670 in.read(&c, 1);
3671 str[length++] = c;
3672 if (length >= str.size())
3673 str.Grow(length + 16);
3674 }
3675 while (in && (c=='-' || c=='x' || (c>='0' && c<='9') || (c>='a' && c<='f') || (c>='A' && c<='F') || c=='h' || c=='H' || c=='o' || c=='O' || c==',' || c=='.'));
3676
3677 if (in.gcount())
3678 in.putback(c);
3679 str[length-1] = '\0';
3680 a = Integer(str);
3681
3682 return in;
3683}
3684
3685// Ensure base 10 is default
3686inline int FlagToBase(long f) {
3687 return f == std::ios::hex ? 16 : (f == std::ios::oct ? 8 : 10);
3688}
3689
3690inline char FlagToSuffix(long f) {
3691 return f == std::ios::hex ? 'h' : (f == std::ios::oct ? 'o' : '.');
3692}
3693
3694// Ensure base 10 is default
3695std::ostream& operator<<(std::ostream& out, const Integer &a)
3696{
3697 // Get relevant conversion specifications from ostream.
3698 const long f = out.flags() & std::ios::basefield;
3699 const int base = FlagToBase(f);
3700 const char suffix = FlagToSuffix(f);
3701
3702 Integer temp1=a, temp2;
3703 if (a.IsNegative())
3704 {
3705 out << '-';
3706 temp1.Negate();
3707 }
3708
3709 if (!a)
3710 out << '0';
3711
3712 static const char upper[]="0123456789ABCDEF";
3713 static const char lower[]="0123456789abcdef";
3714
3715 const char* vec = (out.flags() & std::ios::uppercase) ? upper : lower;
3716 unsigned int i=0;
3717 SecBlock<char> s(a.BitCount() / (SaturatingSubtract1(BitPrecision(base),1U)) + 1);
3718
3719 while (!!temp1)
3720 {
3721 word digit;
3722 Integer::Divide(digit, temp2, temp1, base);
3723 s[i++]=vec[digit];
3724 temp1.swap(temp2);
3725 }
3726
3727 while (i--)
3728 {
3729 out << s[i];
3730 }
3731
3732#ifdef CRYPTOPP_USE_STD_SHOWBASE
3733 if (out.flags() & std::ios_base::showbase)
3734 out << suffix;
3735
3736 return out;
3737#else
3738 return out << suffix;
3739#endif
3740}
3741
3743{
3744 if (NotNegative())
3745 {
3746 if (Increment(reg, reg.size()))
3747 {
3748 reg.CleanGrow(2*reg.size());
3749 reg[reg.size()/2]=1;
3750 }
3751 }
3752 else
3753 {
3754 word borrow = Decrement(reg, reg.size());
3755 CRYPTOPP_ASSERT(!borrow); CRYPTOPP_UNUSED(borrow);
3756
3757 if (WordCount()==0)
3758 *this = Zero();
3759 }
3760 return *this;
3761}
3762
3764{
3765 if (IsNegative())
3766 {
3767 if (Increment(reg, reg.size()))
3768 {
3769 reg.CleanGrow(2*reg.size());
3770 reg[reg.size()/2]=1;
3771 }
3772 }
3773 else
3774 {
3775 if (Decrement(reg, reg.size()))
3776 *this = -One();
3777 }
3778 return *this;
3779}
3780
3781// This is a bit operation. We set sign to POSITIVE, so there's no need to
3782// worry about negative zero. Also see http://stackoverflow.com/q/11644362.
3784{
3785 if (this == &t)
3786 {
3787 return AbsoluteValue();
3788 }
3789 else if (reg.size() >= t.reg.size())
3790 {
3791 Integer result(t);
3792 AndWords(result.reg, reg, t.reg.size());
3793
3794 result.sign = POSITIVE;
3795 return result;
3796 }
3797 else // reg.size() < t.reg.size()
3798 {
3799 Integer result(*this);
3800 AndWords(result.reg, t.reg, reg.size());
3801
3802 result.sign = POSITIVE;
3803 return result;
3804 }
3805}
3806
3807// This is a bit operation. We set sign to POSITIVE, so there's no need to
3808// worry about negative zero. Also see http://stackoverflow.com/q/11644362.
3810{
3811 if (this == &t)
3812 {
3813 return AbsoluteValue();
3814 }
3815 else if (reg.size() >= t.reg.size())
3816 {
3817 Integer result(*this);
3818 OrWords(result.reg, t.reg, t.reg.size());
3819
3820 result.sign = POSITIVE;
3821 return result;
3822 }
3823 else // reg.size() < t.reg.size()
3824 {
3825 Integer result(t);
3826 OrWords(result.reg, reg, reg.size());
3827
3828 result.sign = POSITIVE;
3829 return result;
3830 }
3831}
3832
3833// This is a bit operation. We set sign to POSITIVE, so there's no need to
3834// worry about negative zero. Also see http://stackoverflow.com/q/11644362.
3836{
3837 if (this == &t)
3838 {
3839 return Integer::Zero();
3840 }
3841 else if (reg.size() >= t.reg.size())
3842 {
3843 Integer result(*this);
3844 XorWords(result.reg, t.reg, t.reg.size());
3845
3846 result.sign = POSITIVE;
3847 return result;
3848 }
3849 else // reg.size() < t.reg.size()
3850 {
3851 Integer result(t);
3852 XorWords(result.reg, reg, reg.size());
3853
3854 result.sign = POSITIVE;
3855 return result;
3856 }
3857}
3858
3859void PositiveAdd(Integer &sum, const Integer &a, const Integer& b)
3860{
3861 // Profiling tells us the original second Else If was dominant, so it
3862 // was promoted to the first If statement. The code change occurred at
3863 // Commit dc99266599a0e72d.
3864 int carry; const bool pre = (a.reg.size() == b.reg.size());
3865 if (!pre && a.reg.size() > b.reg.size())
3866 {
3867 carry = Add(sum.reg, a.reg, b.reg, b.reg.size());
3868 CopyWords(sum.reg+b.reg.size(), a.reg+b.reg.size(), a.reg.size()-b.reg.size());
3869 carry = Increment(sum.reg+b.reg.size(), a.reg.size()-b.reg.size(), carry);
3870 }
3871 else if (pre)
3872 {
3873 carry = Add(sum.reg, a.reg, b.reg, a.reg.size());
3874 }
3875 else
3876 {
3877 carry = Add(sum.reg, a.reg, b.reg, a.reg.size());
3878 CopyWords(sum.reg+a.reg.size(), b.reg+a.reg.size(), b.reg.size()-a.reg.size());
3879 carry = Increment(sum.reg+a.reg.size(), b.reg.size()-a.reg.size(), carry);
3880 }
3881
3882 if (carry)
3883 {
3884 sum.reg.CleanGrow(2*sum.reg.size());
3885 sum.reg[sum.reg.size()/2] = 1;
3886 }
3887 sum.sign = Integer::POSITIVE;
3888}
3889
3890void PositiveSubtract(Integer &diff, const Integer &a, const Integer& b)
3891{
3892 unsigned aSize = a.WordCount();
3893 aSize += aSize%2;
3894 unsigned bSize = b.WordCount();
3895 bSize += bSize%2;
3896
3897 // Profiling tells us the original second Else If was dominant, so it
3898 // was promoted to the first If statement. The code change occurred at
3899 // Commit dc99266599a0e72d.
3900 if (aSize > bSize)
3901 {
3902 word borrow = Subtract(diff.reg, a.reg, b.reg, bSize);
3903 CopyWords(diff.reg+bSize, a.reg+bSize, aSize-bSize);
3904 borrow = Decrement(diff.reg+bSize, aSize-bSize, borrow);
3905 CRYPTOPP_ASSERT(!borrow); CRYPTOPP_UNUSED(borrow);
3906 diff.sign = Integer::POSITIVE;
3907 }
3908 else if (aSize == bSize)
3909 {
3910 if (Compare(a.reg, b.reg, aSize) >= 0)
3911 {
3912 Subtract(diff.reg, a.reg, b.reg, aSize);
3913 diff.sign = Integer::POSITIVE;
3914 }
3915 else
3916 {
3917 Subtract(diff.reg, b.reg, a.reg, aSize);
3918 diff.sign = Integer::NEGATIVE;
3919 }
3920 }
3921 else
3922 {
3923 word borrow = Subtract(diff.reg, b.reg, a.reg, aSize);
3924 CopyWords(diff.reg+aSize, b.reg+aSize, bSize-aSize);
3925 borrow = Decrement(diff.reg+aSize, bSize-aSize, borrow);
3926 CRYPTOPP_ASSERT(!borrow); CRYPTOPP_UNUSED(borrow);
3927 diff.sign = Integer::NEGATIVE;
3928 }
3929}
3930
3931// MSVC .NET 2003 workaround
3932template <class T> inline const T& STDMAX2(const T& a, const T& b)
3933{
3934 return a < b ? b : a;
3935}
3936
3938{
3939 Integer sum((word)0, STDMAX2(reg.size(), b.reg.size()));
3940 if (NotNegative())
3941 {
3942 if (b.NotNegative())
3943 PositiveAdd(sum, *this, b);
3944 else
3945 PositiveSubtract(sum, *this, b);
3946 }
3947 else
3948 {
3949 if (b.NotNegative())
3950 PositiveSubtract(sum, b, *this);
3951 else
3952 {
3953 PositiveAdd(sum, *this, b);
3954 sum.sign = Integer::NEGATIVE;
3955 }
3956 }
3957 return sum;
3958}
3959
3961{
3962 reg.CleanGrow(t.reg.size());
3963 if (NotNegative())
3964 {
3965 if (t.NotNegative())
3966 PositiveAdd(*this, *this, t);
3967 else
3968 PositiveSubtract(*this, *this, t);
3969 }
3970 else
3971 {
3972 if (t.NotNegative())
3973 PositiveSubtract(*this, t, *this);
3974 else
3975 {
3976 PositiveAdd(*this, *this, t);
3977 sign = Integer::NEGATIVE;
3978 }
3979 }
3980 return *this;
3981}
3982
3984{
3985 Integer diff((word)0, STDMAX2(reg.size(), b.reg.size()));
3986 if (NotNegative())
3987 {
3988 if (b.NotNegative())
3989 PositiveSubtract(diff, *this, b);
3990 else
3991 PositiveAdd(diff, *this, b);
3992 }
3993 else
3994 {
3995 if (b.NotNegative())
3996 {
3997 PositiveAdd(diff, *this, b);
3998 diff.sign = Integer::NEGATIVE;
3999 }
4000 else
4001 PositiveSubtract(diff, b, *this);
4002 }
4003 return diff;
4004}
4005
4007{
4008 reg.CleanGrow(t.reg.size());
4009 if (NotNegative())
4010 {
4011 if (t.NotNegative())
4012 PositiveSubtract(*this, *this, t);
4013 else
4014 PositiveAdd(*this, *this, t);
4015 }
4016 else
4017 {
4018 if (t.NotNegative())
4019 {
4020 PositiveAdd(*this, *this, t);
4021 sign = Integer::NEGATIVE;
4022 }
4023 else
4024 PositiveSubtract(*this, t, *this);
4025 }
4026 return *this;
4027}
4028
4030{
4031 const size_t wordCount = WordCount();
4032 const size_t shiftWords = n / WORD_BITS;
4033 const unsigned int shiftBits = (unsigned int)(n % WORD_BITS);
4034
4035 reg.CleanGrow(RoundupSize(wordCount+BitsToWords(n)));
4036 ShiftWordsLeftByWords(reg, wordCount + shiftWords, shiftWords);
4037 ShiftWordsLeftByBits(reg+shiftWords, wordCount+BitsToWords(shiftBits), shiftBits);
4038 return *this;
4039}
4040
4042{
4043 const size_t wordCount = WordCount();
4044 const size_t shiftWords = n / WORD_BITS;
4045 const unsigned int shiftBits = (unsigned int)(n % WORD_BITS);
4046
4047 ShiftWordsRightByWords(reg, wordCount, shiftWords);
4048 if (wordCount > shiftWords)
4049 ShiftWordsRightByBits(reg, wordCount-shiftWords, shiftBits);
4050 if (IsNegative() && WordCount()==0) // avoid -0
4051 *this = Zero();
4052 return *this;
4053}
4054
4056{
4057 if (this != &t)
4058 {
4059 const size_t size = STDMIN(reg.size(), t.reg.size());
4060 reg.resize(size);
4061 AndWords(reg, t.reg, size);
4062 }
4063 sign = POSITIVE;
4064 return *this;
4065}
4066
4068{
4069 if (this != &t)
4070 {
4071 if (reg.size() >= t.reg.size())
4072 {
4073 OrWords(reg, t.reg, t.reg.size());
4074 }
4075 else // reg.size() < t.reg.size()
4076 {
4077 const size_t head = reg.size();
4078 const size_t tail = t.reg.size() - reg.size();
4079 reg.resize(head+tail);
4080 OrWords(reg, t.reg, head);
4081 CopyWords(reg+head,t.reg+head,tail);
4082 }
4083 }
4084 sign = POSITIVE;
4085 return *this;
4086}
4087
4089{
4090 if (this == &t)
4091 {
4092 *this = Zero();
4093 }
4094 else
4095 {
4096 if (reg.size() >= t.reg.size())
4097 {
4098 XorWords(reg, t.reg, t.reg.size());
4099 }
4100 else // reg.size() < t.reg.size()
4101 {
4102 const size_t head = reg.size();
4103 const size_t tail = t.reg.size() - reg.size();
4104 reg.resize(head+tail);
4105 XorWords(reg, t.reg, head);
4106 CopyWords(reg+head,t.reg+head,tail);
4107 }
4108 }
4109 sign = POSITIVE;
4110 return *this;
4111}
4112
4113void PositiveMultiply(Integer &product, const Integer &a, const Integer &b)
4114{
4115 size_t aSize = RoundupSize(a.WordCount());
4116 size_t bSize = RoundupSize(b.WordCount());
4117
4118 product.reg.CleanNew(RoundupSize(aSize+bSize));
4119 product.sign = Integer::POSITIVE;
4120
4121 IntegerSecBlock workspace(aSize + bSize);
4122 AsymmetricMultiply(product.reg, workspace, a.reg, aSize, b.reg, bSize);
4123}
4124
4125void Multiply(Integer &product, const Integer &a, const Integer &b)
4126{
4127 PositiveMultiply(product, a, b);
4128
4129 if (a.NotNegative() != b.NotNegative())
4130 product.Negate();
4131}
4132
4134{
4135 Integer product;
4136 Multiply(product, *this, b);
4137 return product;
4138}
4139
4140/*
4141void PositiveDivide(Integer &remainder, Integer &quotient,
4142 const Integer &dividend, const Integer &divisor)
4143{
4144 remainder.reg.CleanNew(divisor.reg.size());
4145 remainder.sign = Integer::POSITIVE;
4146 quotient.reg.New(0);
4147 quotient.sign = Integer::POSITIVE;
4148 unsigned i=dividend.BitCount();
4149 while (i--)
4150 {
4151 word overflow = ShiftWordsLeftByBits(remainder.reg, remainder.reg.size(), 1);
4152 remainder.reg[0] |= dividend[i];
4153 if (overflow || remainder >= divisor)
4154 {
4155 Subtract(remainder.reg, remainder.reg, divisor.reg, remainder.reg.size());
4156 quotient.SetBit(i);
4157 }
4158 }
4159}
4160*/
4161
4162void PositiveDivide(Integer &remainder, Integer &quotient,
4163 const Integer &a, const Integer &b)
4164{
4165 unsigned aSize = a.WordCount();
4166 unsigned bSize = b.WordCount();
4167
4168 if (!bSize)
4169 throw Integer::DivideByZero();
4170
4171 if (aSize < bSize)
4172 {
4173 remainder = a;
4174 remainder.sign = Integer::POSITIVE;
4175 quotient = Integer::Zero();
4176 return;
4177 }
4178
4179 aSize += aSize%2; // round up to next even number
4180 bSize += bSize%2;
4181
4182 remainder.reg.CleanNew(RoundupSize(bSize));
4183 remainder.sign = Integer::POSITIVE;
4184 quotient.reg.CleanNew(RoundupSize(aSize-bSize+2));
4185 quotient.sign = Integer::POSITIVE;
4186
4187 IntegerSecBlock T(aSize+3*(bSize+2));
4188 Divide(remainder.reg, quotient.reg, T, a.reg, aSize, b.reg, bSize);
4189}
4190
4191void Integer::Divide(Integer &remainder, Integer &quotient, const Integer &dividend, const Integer &divisor)
4192{
4193 PositiveDivide(remainder, quotient, dividend, divisor);
4194
4195 if (dividend.IsNegative())
4196 {
4197 quotient.Negate();
4198 if (remainder.NotZero())
4199 {
4200 --quotient;
4201 remainder = divisor.AbsoluteValue() - remainder;
4202 }
4203 }
4204
4205 if (divisor.IsNegative())
4206 quotient.Negate();
4207}
4208
4209void Integer::DivideByPowerOf2(Integer &r, Integer &q, const Integer &a, unsigned int n)
4210{
4211 q = a;
4212 q >>= n;
4213
4214 const size_t wordCount = BitsToWords(n);
4215 if (wordCount <= a.WordCount())
4216 {
4217 r.reg.resize(RoundupSize(wordCount));
4218 CopyWords(r.reg, a.reg, wordCount);
4219 SetWords(r.reg+wordCount, 0, r.reg.size()-wordCount);
4220 if (n % WORD_BITS != 0)
4221 r.reg[wordCount-1] %= (word(1) << (n % WORD_BITS));
4222 }
4223 else
4224 {
4225 r.reg.resize(RoundupSize(a.WordCount()));
4226 CopyWords(r.reg, a.reg, r.reg.size());
4227 }
4228 r.sign = POSITIVE;
4229
4230 if (a.IsNegative() && r.NotZero())
4231 {
4232 --q;
4233 r = Power2(n) - r;
4234 }
4235}
4236
4238{
4239 Integer remainder, quotient;
4240 Integer::Divide(remainder, quotient, *this, b);
4241 return quotient;
4242}
4243
4245{
4246 Integer remainder, quotient;
4247 Integer::Divide(remainder, quotient, *this, b);
4248 return remainder;
4249}
4250
4251void Integer::Divide(word &remainder, Integer &quotient, const Integer &dividend, word divisor)
4252{
4253 if (!divisor)
4254 throw Integer::DivideByZero();
4255
4256 // IsPowerOf2 uses BMI on x86 if available. There is a small
4257 // but measurable improvement during decryption and signing.
4258 if (IsPowerOf2(divisor))
4259 {
4260 quotient = dividend >> (BitPrecision(divisor)-1);
4261 remainder = dividend.reg[0] & (divisor-1);
4262 return;
4263 }
4264
4265 unsigned int i = dividend.WordCount();
4266 quotient.reg.CleanNew(RoundupSize(i));
4267 remainder = 0;
4268 while (i--)
4269 {
4270 quotient.reg[i] = DWord(dividend.reg[i], remainder) / divisor;
4271 remainder = DWord(dividend.reg[i], remainder) % divisor;
4272 }
4273
4274 if (dividend.NotNegative())
4275 quotient.sign = POSITIVE;
4276 else
4277 {
4278 quotient.sign = NEGATIVE;
4279 if (remainder)
4280 {
4281 --quotient;
4282 remainder = divisor - remainder;
4283 }
4284 }
4285}
4286
4288{
4289 word remainder;
4290 Integer quotient;
4291 Integer::Divide(remainder, quotient, *this, b);
4292 return quotient;
4293}
4294
4295word Integer::Modulo(word divisor) const
4296{
4297 if (!divisor)
4298 throw Integer::DivideByZero();
4299
4300 word remainder;
4301
4302 // Profiling tells us the original Else was dominant, so it was
4303 // promoted to the first If statement. The code change occurred
4304 // at Commit dc99266599a0e72d.
4305 if ((divisor & (divisor-1)) != 0) // divisor is not a power of 2
4306 {
4307 // Profiling tells us the original Else was dominant, so it
4308 // was promoted to the first If statement. The code change
4309 // occurred at Commit dc99266599a0e72d.
4310 unsigned int i = WordCount();
4311 if (divisor > 5)
4312 {
4313 remainder = 0;
4314 while (i--)
4315 remainder = DWord(reg[i], remainder) % divisor;
4316 }
4317 else
4318 {
4319 DWord sum(0, 0);
4320 while (i--)
4321 sum += reg[i];
4322 remainder = sum % divisor;
4323 }
4324 }
4325 else // divisor is a power of 2
4326 {
4327 remainder = reg[0] & (divisor-1);
4328 }
4329
4330 if (IsNegative() && remainder)
4331 remainder = divisor - remainder;
4332
4333 return remainder;
4334}
4335
4337{
4338 if (!!(*this)) // don't flip sign if *this==0
4339 sign = Sign(1-sign);
4340}
4341
4342int Integer::PositiveCompare(const Integer& t) const
4343{
4344 // Profiling tells us the original Else was dominant, so it
4345 // was promoted to the first If statement. The code change
4346 // occurred at Commit dc99266599a0e72d.
4347 const unsigned size = WordCount(), tSize = t.WordCount();
4348 if (size != tSize)
4349 return size > tSize ? 1 : -1;
4350 else
4351 return CryptoPP::Compare(reg, t.reg, size);
4352}
4353
4354int Integer::Compare(const Integer& t) const
4355{
4356 if (NotNegative())
4357 {
4358 if (t.NotNegative())
4359 return PositiveCompare(t);
4360 else
4361 return 1;
4362 }
4363 else
4364 {
4365 if (t.NotNegative())
4366 return -1;
4367 else
4368 return -PositiveCompare(t);
4369 }
4370}
4371
4373{
4374 if (!IsPositive())
4375 return Zero();
4376
4377 // overestimate square root
4378 Integer x, y = Power2((BitCount()+1)/2);
4379 CRYPTOPP_ASSERT(y*y >= *this);
4380
4381 do
4382 {
4383 x = y;
4384 y = (x + *this/x) >> 1;
4385 } while (y<x);
4386
4387 return x;
4388}
4389
4391{
4392 Integer r = SquareRoot();
4393 return *this == r.Squared();
4394}
4395
4397{
4398 return (WordCount() == 1) && (reg[0] == 1);
4399}
4400
4402{
4403 return IsUnit() ? *this : Zero();
4404}
4405
4406Integer a_times_b_mod_c(const Integer &x, const Integer& y, const Integer& m)
4407{
4409 if (m.IsZero())
4410 throw Integer::DivideByZero();
4411
4412 return x*y%m;
4413}
4414
4415Integer a_exp_b_mod_c(const Integer &x, const Integer& e, const Integer& m)
4416{
4418 if (m.IsZero())
4419 throw Integer::DivideByZero();
4420
4421 ModularArithmetic mr(m);
4422 return mr.Exponentiate(x, e);
4423}
4424
4426{
4427 return EuclideanDomainOf<Integer>().Gcd(a, b);
4428}
4429
4431{
4434
4435 if (IsNegative())
4436 return Modulo(m).InverseModNext(m);
4437
4438 // http://github.com/weidai11/cryptopp/issues/602
4439 if (*this >= m)
4440 return Modulo(m).InverseModNext(m);
4441
4442 return InverseModNext(m);
4443}
4444
4445Integer Integer::InverseModNext(const Integer &m) const
4446{
4449
4450 if (m.IsEven())
4451 {
4452 if (!m || IsEven())
4453 return Zero(); // no inverse
4454 if (*this == One())
4455 return One();
4456
4457 Integer u = m.Modulo(*this).InverseModNext(*this);
4458 return !u ? Zero() : (m*(*this-u)+1)/(*this);
4459 }
4460
4461 // AlmostInverse requires a 4x workspace
4462 IntegerSecBlock T(m.reg.size() * 4);
4463 Integer r((word)0, m.reg.size());
4464 unsigned k = AlmostInverse(r.reg, T, reg, reg.size(), m.reg, m.reg.size());
4465 DivideByPower2Mod(r.reg, r.reg, k, m.reg, m.reg.size());
4466 return r;
4467}
4468
4469word Integer::InverseMod(word mod) const
4470{
4471 CRYPTOPP_ASSERT(mod != 0);
4472
4473 word g0 = mod, g1 = *this % mod;
4474 word v0 = 0, v1 = 1;
4475 word y;
4476
4477 while (g1)
4478 {
4479 if (g1 == 1)
4480 return v1;
4481 y = g0 / g1;
4482 g0 = g0 % g1;
4483 v0 += y * v1;
4484
4485 if (!g0)
4486 break;
4487 if (g0 == 1)
4488 return mod-v0;
4489 y = g1 / g0;
4490 g1 = g1 % g0;
4491 v1 += y * v0;
4492 }
4493 return 0;
4494}
4495
4496// ********************************************************
4497
4499{
4500 BERSequenceDecoder seq(bt);
4501 OID oid(seq);
4502 if (oid != ASN1::prime_field())
4504 m_modulus.BERDecode(seq);
4505 seq.MessageEnd();
4506 m_result.reg.resize(m_modulus.reg.size());
4507}
4508
4510{
4511 DERSequenceEncoder seq(bt);
4512 ASN1::prime_field().DEREncode(seq);
4513 m_modulus.DEREncode(seq);
4514 seq.MessageEnd();
4515}
4516
4518{
4519 a.DEREncodeAsOctetString(out, MaxElementByteLength());
4520}
4521
4523{
4524 a.BERDecodeAsOctetString(in, MaxElementByteLength());
4525}
4526
4528{
4529 if (a.reg.size()==m_modulus.reg.size())
4530 {
4531 CryptoPP::DivideByPower2Mod(m_result.reg.begin(), a.reg, 1, m_modulus.reg, a.reg.size());
4532 return m_result;
4533 }
4534 else
4535 return m_result1 = (a.IsEven() ? (a >> 1) : ((a+m_modulus) >> 1));
4536}
4537
4538const Integer& ModularArithmetic::Add(const Integer &a, const Integer &b) const
4539{
4540 if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4541 {
4542 if (CryptoPP::Add(m_result.reg.begin(), a.reg, b.reg, a.reg.size())
4543 || Compare(m_result.reg, m_modulus.reg, a.reg.size()) >= 0)
4544 {
4545 CryptoPP::Subtract(m_result.reg.begin(), m_result.reg, m_modulus.reg, a.reg.size());
4546 }
4547 return m_result;
4548 }
4549 else
4550 {
4551 m_result1 = a+b;
4552 if (m_result1 >= m_modulus)
4553 m_result1 -= m_modulus;
4554 return m_result1;
4555 }
4556}
4557
4559{
4560 if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4561 {
4562 if (CryptoPP::Add(a.reg, a.reg, b.reg, a.reg.size())
4563 || Compare(a.reg, m_modulus.reg, a.reg.size()) >= 0)
4564 {
4565 CryptoPP::Subtract(a.reg, a.reg, m_modulus.reg, a.reg.size());
4566 }
4567 }
4568 else
4569 {
4570 a+=b;
4571 if (a>=m_modulus)
4572 a-=m_modulus;
4573 }
4574
4575 return a;
4576}
4577
4578const Integer& ModularArithmetic::Subtract(const Integer &a, const Integer &b) const
4579{
4580 if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4581 {
4582 if (CryptoPP::Subtract(m_result.reg.begin(), a.reg, b.reg, a.reg.size()))
4583 CryptoPP::Add(m_result.reg.begin(), m_result.reg, m_modulus.reg, a.reg.size());
4584 return m_result;
4585 }
4586 else
4587 {
4588 m_result1 = a-b;
4589 if (m_result1.IsNegative())
4590 m_result1 += m_modulus;
4591 return m_result1;
4592 }
4593}
4594
4596{
4597 if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4598 {
4599 if (CryptoPP::Subtract(a.reg, a.reg, b.reg, a.reg.size()))
4600 CryptoPP::Add(a.reg, a.reg, m_modulus.reg, a.reg.size());
4601 }
4602 else
4603 {
4604 a-=b;
4605 if (a.IsNegative())
4606 a+=m_modulus;
4607 }
4608
4609 return a;
4610}
4611
4613{
4614 if (!a)
4615 return a;
4616
4617 CopyWords(m_result.reg.begin(), m_modulus.reg, m_modulus.reg.size());
4618 if (CryptoPP::Subtract(m_result.reg.begin(), m_result.reg, a.reg, a.reg.size()))
4619 Decrement(m_result.reg.begin()+a.reg.size(), m_modulus.reg.size()-a.reg.size());
4620
4621 return m_result;
4622}
4623
4624Integer ModularArithmetic::CascadeExponentiate(const Integer &x, const Integer &e1, const Integer &y, const Integer &e2) const
4625{
4626 if (m_modulus.IsOdd())
4627 {
4628 MontgomeryRepresentation dr(m_modulus);
4629 return dr.ConvertOut(dr.CascadeExponentiate(dr.ConvertIn(x), e1, dr.ConvertIn(y), e2));
4630 }
4631 else
4632 return AbstractRing<Integer>::CascadeExponentiate(x, e1, y, e2);
4633}
4634
4635void ModularArithmetic::SimultaneousExponentiate(Integer *results, const Integer &base, const Integer *exponents, unsigned int exponentsCount) const
4636{
4637 if (m_modulus.IsOdd())
4638 {
4639 MontgomeryRepresentation dr(m_modulus);
4640 dr.SimultaneousExponentiate(results, dr.ConvertIn(base), exponents, exponentsCount);
4641 for (unsigned int i=0; i<exponentsCount; i++)
4642 results[i] = dr.ConvertOut(results[i]);
4643 }
4644 else
4645 AbstractRing<Integer>::SimultaneousExponentiate(results, base, exponents, exponentsCount);
4646}
4647
4649 : ModularArithmetic(m),
4650 m_u((word)0, m_modulus.reg.size()),
4651 m_workspace(5*m_modulus.reg.size())
4652{
4653 if (!m_modulus.IsOdd())
4654 throw InvalidArgument("MontgomeryRepresentation: Montgomery representation requires an odd modulus");
4655
4656 RecursiveInverseModPower2(m_u.reg, m_workspace, m_modulus.reg, m_modulus.reg.size());
4657}
4658
4660{
4661 word *const T = m_workspace.begin();
4662 word *const R = m_result.reg.begin();
4663 const size_t N = m_modulus.reg.size();
4664 CRYPTOPP_ASSERT(a.reg.size()<=N && b.reg.size()<=N);
4665
4666 AsymmetricMultiply(T, T+2*N, a.reg, a.reg.size(), b.reg, b.reg.size());
4667 SetWords(T+a.reg.size()+b.reg.size(), 0, 2*N-a.reg.size()-b.reg.size());
4668 MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4669 return m_result;
4670}
4671
4673{
4674 word *const T = m_workspace.begin();
4675 word *const R = m_result.reg.begin();
4676 const size_t N = m_modulus.reg.size();
4677 CRYPTOPP_ASSERT(a.reg.size()<=N);
4678
4679 CryptoPP::Square(T, T+2*N, a.reg, a.reg.size());
4680 SetWords(T+2*a.reg.size(), 0, 2*N-2*a.reg.size());
4681 MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4682 return m_result;
4683}
4684
4686{
4687 word *const T = m_workspace.begin();
4688 word *const R = m_result.reg.begin();
4689 const size_t N = m_modulus.reg.size();
4690 CRYPTOPP_ASSERT(a.reg.size()<=N);
4691
4692 CopyWords(T, a.reg, a.reg.size());
4693 SetWords(T+a.reg.size(), 0, 2*N-a.reg.size());
4694 MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4695 return m_result;
4696}
4697
4699{
4700// return (EuclideanMultiplicativeInverse(a, modulus)<<(2*WORD_BITS*modulus.reg.size()))%modulus;
4701 word *const T = m_workspace.begin();
4702 word *const R = m_result.reg.begin();
4703 const size_t N = m_modulus.reg.size();
4704 CRYPTOPP_ASSERT(a.reg.size()<=N);
4705
4706 CopyWords(T, a.reg, a.reg.size());
4707 SetWords(T+a.reg.size(), 0, 2*N-a.reg.size());
4708 MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4709 unsigned k = AlmostInverse(R, T, R, N, m_modulus.reg, N);
4710
4711// cout << "k=" << k << " N*32=" << 32*N << endl;
4712
4713 if (k>N*WORD_BITS)
4714 DivideByPower2Mod(R, R, k-N*WORD_BITS, m_modulus.reg, N);
4715 else
4716 MultiplyByPower2Mod(R, R, N*WORD_BITS-k, m_modulus.reg, N);
4717
4718 return m_result;
4719}
4720
4721// Specialization declared in misc.h to allow us to print integers
4722// with additional control options, like arbirary bases and uppercase.
4723template <> CRYPTOPP_DLL
4724std::string IntToString<Integer>(Integer value, unsigned int base)
4725{
4726 // Hack... set the high bit for uppercase. Set the next bit fo a suffix.
4727 static const unsigned int BIT_32 = (1U << 31);
4728 const bool UPPER = !!(base & BIT_32);
4729 static const unsigned int BIT_31 = (1U << 30);
4730 const bool BASE = !!(base & BIT_31);
4731
4732 const char CH = UPPER ? 'A' : 'a';
4733 base &= ~(BIT_32|BIT_31);
4734 CRYPTOPP_ASSERT(base >= 2 && base <= 32);
4735
4736 if (value == 0)
4737 return "0";
4738
4739 bool negative = false, zero = false;
4740 if (value.IsNegative())
4741 {
4742 negative = true;
4743 value.Negate();
4744 }
4745
4746 if (!value)
4747 zero = true;
4748
4749 SecBlock<char> s(value.BitCount() / (SaturatingSubtract1(BitPrecision(base),1U)) + 1);
4750 Integer temp;
4751
4752 unsigned int i=0;
4753 while (!!value)
4754 {
4755 word digit;
4756 Integer::Divide(digit, temp, value, word(base));
4757 s[i++]=char((digit < 10 ? '0' : (CH - 10)) + digit);
4758 value.swap(temp);
4759 }
4760
4761 std::string result;
4762 result.reserve(i+2);
4763
4764 if (negative)
4765 result += '-';
4766
4767 if (zero)
4768 result += '0';
4769
4770 while (i--)
4771 result += s[i];
4772
4773 if (BASE)
4774 {
4775 if (base == 10)
4776 result += '.';
4777 else if (base == 16)
4778 result += 'h';
4779 else if (base == 8)
4780 result += 'o';
4781 else if (base == 2)
4782 result += 'b';
4783 }
4784
4785 return result;
4786}
4787
4788// Specialization declared in misc.h to avoid Coverity findings.
4789template <> CRYPTOPP_DLL
4790std::string IntToString<word64>(word64 value, unsigned int base)
4791{
4792 // Hack... set the high bit for uppercase.
4793 static const unsigned int HIGH_BIT = (1U << 31);
4794 const char CH = !!(base & HIGH_BIT) ? 'A' : 'a';
4795 base &= ~HIGH_BIT;
4796
4797 CRYPTOPP_ASSERT(base >= 2);
4798 if (value == 0)
4799 return "0";
4800
4801 std::string result;
4802 while (value > 0)
4803 {
4804 word64 digit = value % base;
4805 result = char((digit < 10 ? '0' : (CH - 10)) + digit) + result;
4806 value /= base;
4807 }
4808 return result;
4809}
4810
4811#ifndef CRYPTOPP_NO_ASSIGN_TO_INTEGER
4812// Allow the linker to discard Integer code if not needed.
4813// Also see http://github.com/weidai11/cryptopp/issues/389.
4814bool AssignIntToInteger(const std::type_info &valueType, void *pInteger, const void *pInt)
4815{
4816 if (valueType != typeid(Integer))
4817 return false;
4818 *reinterpret_cast<Integer *>(pInteger) = *reinterpret_cast<const int *>(pInt);
4819 return true;
4820}
4821#endif // CRYPTOPP_NO_ASSIGN_TO_INTEGER
4822
4823// *************************** C++ Static Initialization ***************************
4824
4826{
4827public:
4828 InitInteger()
4829 {
4830 SetFunctionPointers();
4831 }
4832};
4833
4834// This is not really needed because each Integer can dynamically initialize
4835// itself, but we take a peephole optimization and initialize the class once
4836// if init priorities are available. Dynamic initialization will be used if
4837// init priorities are not available.
4838
4839#if defined(HAVE_GCC_INIT_PRIORITY)
4840 const InitInteger s_init __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 10))) = InitInteger();
4841 const Integer g_zero __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 11))) = Integer(0L);
4842 const Integer g_one __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 12))) = Integer(1L);
4843 const Integer g_two __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 13))) = Integer(2L);
4844#elif defined(HAVE_MSC_INIT_PRIORITY)
4845 #pragma warning(disable: 4075)
4846 #pragma init_seg(".CRT$XCU")
4847 const InitInteger s_init;
4848 const Integer g_zero(0L);
4849 const Integer g_one(1L);
4850 const Integer g_two(2L);
4851 #pragma warning(default: 4075)
4852#elif HAVE_XLC_INIT_PRIORITY
4853 // XLC needs constant, not a define
4854 #pragma priority(280)
4855 const InitInteger s_init;
4856 const Integer g_zero(0L);
4857 const Integer g_one(1L);
4858 const Integer g_two(2L);
4859#else
4860 const InitInteger s_init;
4861#endif
4862
4863// ***************** Library code ********************
4864
4866{
4867#if defined(HAVE_GCC_INIT_PRIORITY) || defined(HAVE_MSC_INIT_PRIORITY) || defined(HAVE_XLC_INIT_PRIORITY)
4868 return g_zero;
4869#elif defined(CRYPTOPP_CXX11_DYNAMIC_INIT)
4870 static const Integer s_zero(0L);
4871 return s_zero;
4872#else // Potential memory leak. Avoid if possible.
4873 return Singleton<Integer, NewInteger<0L> >().Ref();
4874#endif
4875}
4876
4878{
4879#if defined(HAVE_GCC_INIT_PRIORITY) || defined(HAVE_MSC_INIT_PRIORITY) || defined(HAVE_XLC_INIT_PRIORITY)
4880 return g_one;
4881#elif defined(CRYPTOPP_CXX11_DYNAMIC_INIT)
4882 static const Integer s_one(1L);
4883 return s_one;
4884#else // Potential memory leak. Avoid if possible.
4885 return Singleton<Integer, NewInteger<1L> >().Ref();
4886#endif
4887}
4888
4890{
4891#if defined(HAVE_GCC_INIT_PRIORITY) || defined(HAVE_MSC_INIT_PRIORITY) || defined(HAVE_XLC_INIT_PRIORITY)
4892 return g_two;
4893#elif defined(CRYPTOPP_CXX11_DYNAMIC_INIT)
4894 static const Integer s_two(2L);
4895 return s_two;
4896#else // Potential memory leak. Avoid if possible.
4897 return Singleton<Integer, NewInteger<2L> >().Ref();
4898#endif
4899}
4900
4901NAMESPACE_END
4902
4903#endif // CRYPTOPP_IMPORTS
Classes for working with NameValuePairs.
AlgorithmParameters MakeParameters(const char *name, const T &value, bool throwIfNotUsed=true)
Create an object that implements NameValuePairs.
Definition: algparam.h:502
Classes and functions for working with ANS.1 objects.
size_t DEREncodeUnsigned(BufferedTransformation &out, T w, byte asnTag=INTEGER)
DER Encode unsigned value.
Definition: asn.h:461
void BERDecodeError()
Raises a BERDecodeErr.
Definition: asn.h:69
virtual const Element & Gcd(const Element &a, const Element &b) const
Calculates the greatest common denominator in the ring.
Definition: algebra.cpp:56
virtual Element Exponentiate(const Element &a, const Integer &e) const
Raises a base to an exponent in the group.
Definition: algebra.cpp:316
virtual void SimultaneousExponentiate(Element *results, const Element &base, const Integer *exponents, unsigned int exponentsCount) const
Exponentiates a base to multiple exponents in the Ring.
Definition: algebra.cpp:334
virtual Element CascadeExponentiate(const Element &x, const Integer &e1, const Element &y, const Integer &e2) const
TODO.
Definition: algebra.cpp:323
Copy input to a memory buffer.
Definition: filters.h:1137
BER General Decoder.
Definition: asn.h:259
BER Sequence Decoder.
Definition: asn.h:310
Interface for buffered transformations.
Definition: cryptlib.h:1599
virtual lword Skip(lword skipMax=LWORD_MAX)
Discard skipMax bytes from the output buffer.
Definition: cryptlib.cpp:573
size_t GetWord16(word16 &value, ByteOrder order=BIG_ENDIAN_ORDER)
Retrieve a 16-bit word.
Definition: cryptlib.cpp:793
virtual lword MaxRetrievable() const
Provides the number of bytes ready for retrieval.
Definition: cryptlib.cpp:504
size_t PutWord16(word16 value, ByteOrder order=BIG_ENDIAN_ORDER, bool blocking=true)
Input a 16-bit word for processing.
Definition: cryptlib.cpp:745
virtual size_t Get(byte &outByte)
Retrieve a 8-bit byte.
Definition: cryptlib.cpp:527
virtual size_t Peek(byte &outByte) const
Peek a 8-bit byte.
Definition: cryptlib.cpp:550
size_t Put(byte inByte, bool blocking=true)
Input a byte for processing.
Definition: cryptlib.h:1620
Data structure used to store byte strings.
Definition: queue.h:19
size_t Get(byte &outByte)
Retrieve a 8-bit byte.
Definition: queue.cpp:301
lword MaxRetrievable() const
Provides the number of bytes ready for retrieval.
Definition: queue.h:33
Used to pass byte array input as part of a NameValuePairs object.
Definition: algparam.h:21
const byte * begin() const
Pointer to the first byte in the memory block.
Definition: algparam.h:80
size_t size() const
Length of the memory block.
Definition: algparam.h:84
DER General Encoder.
Definition: asn.h:292
DER Sequence Encoder.
Definition: asn.h:320
Euclidean domain.
Definition: algebra.h:316
Exception thrown when division by 0 is encountered.
Definition: integer.h:56
Exception thrown when an error is encountered decoding an OpenPGP integer.
Definition: integer.h:283
Exception thrown when a random number cannot be found that satisfies the condition.
Definition: integer.h:64
Multiple precision integer with arithmetic operations.
Definition: integer.h:50
void DEREncode(BufferedTransformation &bt) const
Encode in DER format.
Definition: integer.cpp:3432
Integer & operator>>=(size_t n)
Right-shift Assignment.
Definition: integer.cpp:4041
Integer & operator&=(const Integer &t)
Bitwise AND Assignment.
Definition: integer.cpp:4055
bool GetBit(size_t i) const
Provides the i-th bit of the Integer.
Definition: integer.cpp:3103
void SetByte(size_t n, byte value)
Set the n-th byte to value.
Definition: integer.cpp:3139
bool IsPositive() const
Determines if the Integer is positive.
Definition: integer.h:342
static const Integer & Zero()
Integer representing 0.
Definition: integer.cpp:4865
Integer Minus(const Integer &b) const
Subtraction.
Definition: integer.cpp:3983
static Integer Gcd(const Integer &a, const Integer &n)
Calculate greatest common divisor.
Definition: integer.cpp:4425
signed long ConvertToLong() const
Convert the Integer to Long.
Definition: integer.cpp:3012
Integer operator-() const
Subtraction.
Definition: integer.cpp:3155
void SetBit(size_t n, bool value=1)
Set the n-th bit to value.
Definition: integer.cpp:3114
Integer & operator+=(const Integer &t)
Addition Assignment.
Definition: integer.cpp:3960
Integer And(const Integer &t) const
Bitwise AND.
Definition: integer.cpp:3783
bool IsSquare() const
Determine whether this integer is a perfect square.
Definition: integer.cpp:4390
Integer Plus(const Integer &b) const
Addition.
Definition: integer.cpp:3937
Integer DividedBy(const Integer &b) const
Division.
Definition: integer.cpp:4237
Integer & operator++()
Pre-increment.
Definition: integer.cpp:3742
void DEREncodeAsOctetString(BufferedTransformation &bt, size_t length) const
Encode absolute value as big-endian octet string.
Definition: integer.cpp:3455
void OpenPGPDecode(const byte *input, size_t inputLen)
Decode from OpenPGP format.
Definition: integer.cpp:3488
bool NotZero() const
Determines if the Integer is non-0.
Definition: integer.h:333
Integer Times(const Integer &b) const
Multiplication.
Definition: integer.cpp:4133
static void Divide(Integer &r, Integer &q, const Integer &a, const Integer &d)
Extended Division.
Definition: integer.cpp:4191
void BERDecodeAsOctetString(BufferedTransformation &bt, size_t length)
Decode nonnegative value from big-endian octet string.
Definition: integer.cpp:3462
Integer & operator--()
Pre-decrement.
Definition: integer.cpp:3763
byte GetByte(size_t i) const
Provides the i-th byte of the Integer.
Definition: integer.cpp:3128
void Randomize(RandomNumberGenerator &rng, size_t bitCount)
Set this Integer to random integer.
Definition: integer.cpp:3503
bool IsConvertableToLong() const
Determines if the Integer is convertable to Long.
Definition: integer.cpp:2998
Integer Or(const Integer &t) const
Bitwise OR.
Definition: integer.cpp:3809
lword GetBits(size_t i, size_t n) const
Provides the low order bits of the Integer.
Definition: integer.cpp:3146
Integer Squared() const
Multiply this integer by itself.
Definition: integer.h:609
Integer()
Creates the zero integer.
Definition: integer.cpp:2958
void BERDecode(const byte *input, size_t inputLen)
Decode from BER format.
Definition: integer.cpp:3439
size_t MinEncodedSize(Signedness sign=UNSIGNED) const
Minimum number of bytes to encode this integer.
Definition: integer.cpp:3393
unsigned int BitCount() const
Determines the number of bits required to represent the Integer.
Definition: integer.cpp:3345
Integer & operator|=(const Integer &t)
Bitwise OR Assignment.
Definition: integer.cpp:4067
void Negate()
Reverse the Sign of the Integer.
Definition: integer.cpp:4336
bool NotNegative() const
Determines if the Integer is non-negative.
Definition: integer.h:339
unsigned int WordCount() const
Determines the number of words required to represent the Integer.
Definition: integer.cpp:3331
static const Integer & One()
Integer representing 1.
Definition: integer.cpp:4877
Integer & operator^=(const Integer &t)
Bitwise XOR Assignment.
Definition: integer.cpp:4088
Integer & operator=(const Integer &t)
Assignment.
Definition: integer.cpp:3091
Integer & operator-=(const Integer &t)
Subtraction Assignment.
Definition: integer.cpp:4006
RandomNumberType
Properties of a random integer.
Definition: integer.h:91
@ ANY
a number with no special properties
Definition: integer.h:93
@ PRIME
a number which is probabilistically prime
Definition: integer.h:95
bool operator!() const
Negation.
Definition: integer.cpp:3086
Integer AbsoluteValue() const
Retrieve the absolute value of this integer.
Definition: integer.cpp:3162
Signedness
Used when importing and exporting integers.
Definition: integer.h:83
@ SIGNED
a signed value
Definition: integer.h:87
@ UNSIGNED
an unsigned value
Definition: integer.h:85
Integer Xor(const Integer &t) const
Bitwise XOR.
Definition: integer.cpp:3835
int Compare(const Integer &a) const
Perform signed comparison.
Definition: integer.cpp:4354
Integer Modulo(const Integer &b) const
Remainder.
Definition: integer.cpp:4244
size_t OpenPGPEncode(byte *output, size_t bufferSize) const
Encode absolute value in OpenPGP format.
Definition: integer.cpp:3471
void swap(Integer &a)
Swaps this Integer with another Integer.
Definition: integer.cpp:3169
static void DivideByPowerOf2(Integer &r, Integer &q, const Integer &a, unsigned int n)
Extended Division.
Definition: integer.cpp:4209
bool IsZero() const
Determines if the Integer is 0.
Definition: integer.h:330
Integer MultiplicativeInverse() const
Calculate multiplicative inverse.
Definition: integer.cpp:4401
bool GenerateRandomNoThrow(RandomNumberGenerator &rng, const NameValuePairs &params=g_nullNameValuePairs)
Generate a random number.
Definition: integer.cpp:3567
bool IsNegative() const
Determines if the Integer is negative.
Definition: integer.h:336
void Decode(const byte *input, size_t inputLen, Signedness sign=UNSIGNED)
Decode from big-endian byte array.
Definition: integer.cpp:3354
Integer & operator<<=(size_t n)
Left-shift Assignment.
Definition: integer.cpp:4029
static Integer Power2(size_t e)
Exponentiates to a power of 2.
Definition: integer.cpp:3079
Sign
Used internally to represent the integer.
Definition: integer.h:73
@ NEGATIVE
the value is negative
Definition: integer.h:77
@ POSITIVE
the value is positive or 0
Definition: integer.h:75
unsigned int ByteCount() const
Determines the number of bytes required to represent the Integer.
Definition: integer.cpp:3336
bool IsUnit() const
Determine if 1 or -1.
Definition: integer.cpp:4396
bool IsOdd() const
Determines if the Integer is odd parity.
Definition: integer.h:351
static const Integer & Two()
Integer representing 2.
Definition: integer.cpp:4889
void Encode(byte *output, size_t outputLen, Signedness sign=UNSIGNED) const
Encode in big-endian format.
Definition: integer.cpp:3410
Integer InverseMod(const Integer &n) const
Calculate multiplicative inverse.
Definition: integer.cpp:4430
Integer SquareRoot() const
Extract square root.
Definition: integer.cpp:4372
bool IsEven() const
Determines if the Integer is even parity.
Definition: integer.h:348
An invalid argument was detected.
Definition: cryptlib.h:203
void GenerateBlock(byte *output, size_t size)
Generate random array of bytes.
Definition: integer.cpp:3545
Ring of congruence classes modulo n.
Definition: modarith.h:39
Integer & Reduce(Integer &a, const Integer &b) const
TODO.
Definition: integer.cpp:4595
ModularArithmetic(const Integer &modulus=Integer::One())
Construct a ModularArithmetic.
Definition: modarith.h:49
const Integer & Half(const Integer &a) const
Divides an element by 2.
Definition: integer.cpp:4527
const Integer & Inverse(const Integer &a) const
Inverts the element in the ring.
Definition: integer.cpp:4612
void BERDecodeElement(BufferedTransformation &in, Element &a) const
Decodes element in DER format.
Definition: integer.cpp:4522
unsigned int MaxElementByteLength() const
Provides the maximum byte size of an element in the ring.
Definition: modarith.h:232
void DEREncodeElement(BufferedTransformation &out, const Element &a) const
Encodes element in DER format.
Definition: integer.cpp:4517
void SimultaneousExponentiate(Element *results, const Element &base, const Integer *exponents, unsigned int exponentsCount) const
Exponentiates a base to multiple exponents in the ring.
Definition: integer.cpp:4635
Integer CascadeExponentiate(const Integer &x, const Integer &e1, const Integer &y, const Integer &e2) const
TODO.
Definition: integer.cpp:4624
Integer & Accumulate(Integer &a, const Integer &b) const
TODO.
Definition: integer.cpp:4558
const Integer & Subtract(const Integer &a, const Integer &b) const
Subtracts elements in the ring.
Definition: integer.cpp:4578
const Integer & Add(const Integer &a, const Integer &b) const
Adds elements in the ring.
Definition: integer.cpp:4538
void DEREncode(BufferedTransformation &bt) const
Encodes in DER format.
Definition: integer.cpp:4509
Performs modular arithmetic in Montgomery representation for increased speed.
Definition: modarith.h:275
void SimultaneousExponentiate(Element *results, const Element &base, const Integer *exponents, unsigned int exponentsCount) const
Exponentiates a base to multiple exponents in the Ring.
Definition: modarith.h:309
Integer ConvertOut(const Integer &a) const
Reduces an element in the congruence class.
Definition: integer.cpp:4685
const Integer & Square(const Integer &a) const
Square an element in the ring.
Definition: integer.cpp:4672
Integer ConvertIn(const Integer &a) const
Reduces an element in the congruence class.
Definition: modarith.h:292
Integer CascadeExponentiate(const Integer &x, const Integer &e1, const Integer &y, const Integer &e2) const
TODO.
Definition: modarith.h:306
const Integer & Multiply(const Integer &a, const Integer &b) const
Multiplies elements in the ring.
Definition: integer.cpp:4659
MontgomeryRepresentation(const Integer &modulus)
Construct a MontgomeryRepresentation.
Definition: integer.cpp:4648
const Integer & MultiplicativeInverse(const Integer &a) const
Calculate the multiplicative inverse of an element in the ring.
Definition: integer.cpp:4698
Interface for retrieving values given their names.
Definition: cryptlib.h:294
T GetValueWithDefault(const char *name, T defaultValue) const
Get a named value.
Definition: cryptlib.h:363
bool GetValue(const char *name, T &value) const
Get a named value.
Definition: cryptlib.h:350
bool GetIntValue(const char *name, int &value) const
Get a named value with type int.
Definition: cryptlib.h:386
Object Identifier.
Definition: asn.h:167
P1363 key derivation function.
Definition: pubkey.h:730
Application callback to signal suitability of a cabdidate prime.
Definition: nbtheory.h:114
Interface for random number generators.
Definition: cryptlib.h:1384
virtual void GenerateBlock(byte *output, size_t size)
Generate random array of bytes.
Definition: cryptlib.cpp:311
Secure memory block with allocator and cleanup.
Definition: secblock.h:689
iterator begin()
Provides an iterator pointing to the first element in the memory block.
Definition: secblock.h:772
void CleanNew(size_type newSize)
Change size without preserving contents.
Definition: secblock.h:980
void swap(SecBlock< T, A > &b)
Swap contents with another SecBlock.
Definition: secblock.h:1041
void CleanGrow(size_type newSize)
Change size and preserve contents.
Definition: secblock.h:1013
void Grow(size_type newSize)
Change size and preserve contents.
Definition: secblock.h:995
void New(size_type newSize)
Change size without preserving contents.
Definition: secblock.h:965
size_type size() const
Provides the count of elements in the SecBlock.
Definition: secblock.h:797
void resize(size_type newSize)
Change size and preserve contents.
Definition: secblock.h:1031
SecBlock<byte> typedef.
Definition: secblock.h:1058
Restricts the instantiation of a class to one static object without locks.
Definition: misc.h:264
Square block cipher.
Definition: square.h:25
String-based implementation of Store interface.
Definition: filters.h:1196
Pointer that overloads operator ->
Definition: smartptr.h:37
Library configuration file.
Functions for CPU features and intrinsics.
bool HasSSE2()
Determines SSE2 availability.
Definition: cpu.h:116
bool IsP4()
Determines if the CPU is an Intel P4.
Definition: cpu.h:236
ByteOrder
Provides the byte ordering.
Definition: cryptlib.h:143
@ LITTLE_ENDIAN_ORDER
byte order is little-endian
Definition: cryptlib.h:145
@ BIG_ENDIAN_ORDER
byte order is big-endian
Definition: cryptlib.h:147
Implementation of BufferedTransformation's attachment interface.
Multiple precision integer with arithmetic operations.
Utility functions for the Crypto++ library.
const T & STDMAX(const T &a, const T &b)
Replacement function for std::max.
Definition: misc.h:578
unsigned int BitPrecision(const T &value)
Returns the number of bits required for a value.
Definition: misc.h:754
unsigned int BytePrecision(const T &value)
Returns the number of 8-bit bytes or octets required for a value.
Definition: misc.h:731
size_t BitsToWords(size_t bitCount)
Returns the number of words required for the specified number of bits.
Definition: misc.h:870
unsigned int TrailingZeros(word32 v)
Determines the number of trailing 0-bits in a value.
Definition: misc.h:779
void ConditionalSwapPointers(bool c, T &a, T &b)
Performs a branchless swap of pointers a and b if condition c is true.
Definition: misc.h:1269
T Crop(T value, size_t bits)
Truncates the value to the specified number of bits.
Definition: misc.h:838
bool IsPowerOf2(const T &value)
Tests whether a value is a power of 2.
Definition: misc.h:921
const T & STDMIN(const T &a, const T &b)
Replacement function for std::min.
Definition: misc.h:567
size_t BitsToBytes(size_t bitCount)
Returns the number of 8-bit bytes or octets required for the specified number of bits.
Definition: misc.h:850
#define MEMORY_BARRIER
A memory barrier.
Definition: misc.h:227
size_t BytesToWords(size_t byteCount)
Returns the number of words required for the specified number of bytes.
Definition: misc.h:860
void PutWord(bool assumeAligned, ByteOrder order, byte *block, T value, const byte *xorBlock=NULL)
Access a block of memory.
Definition: misc.h:2428
T1 SaturatingSubtract1(const T1 &a, const T2 &b)
Performs a saturating subtract clamped at 1.
Definition: misc.h:1021
Class file for performing modular arithmetic.
Crypto++ library namespace.
const char * PointerToPrimeSelector()
const PrimeSelector *
Definition: argnames.h:42
const char * Seed()
ConstByteArrayParameter.
Definition: argnames.h:19
Classes and functions for number theoretic operations.
ASN.1 object identifiers for algorthms and schemes.
Precompiled header file.
This file contains helper classes/functions for implementing public key algorithms.
Classes and functions for secure memory allocations.
Classes for SHA-1 and SHA-2 family of message digests.
Classes for automatic resource management.
Common C++ header files.
#define CRYPTOPP_ASSERT(exp)
Debugging and diagnostic assertion.
Definition: trap.h:69