High Performance Cryptographic Hardware using Pipelined Data Encryption Standard Units
Original Publication Date: 1995-Jan-01
Included in the Prior Art Database: 2005-Mar-29
Butter, A: AUTHOR [+5]
Disclosed is a method used to achieve significant performance improvement for the Data Encryption Standard CBC Decipher function while abiding by practical constraints in implementing parallel Data Encryption Standard (DES) [1,2] hardware on a single VLSI chip.
It is assumed that the DES Units used have a two-stage pipeline, so
that once a DES Unit operation is initiated, another operation may be
started on the same DES Unit one cycle before the preceding
operation's result is available. For example, if operation 1 is
initiated on DES Unit 1 and it takes five cycles to produce a DES
result, then operation 2 may be initiated on DES Unit 1 in processing
cycle 4 of operation 1.
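
The pipelining assumption above can be restated as a one-line rule. The following is a minimal sketch only; the function name earliest_next_start and its parameter are hypothetical and do not come from the disclosure.

```python
def earliest_next_start(latency_cycles: int) -> int:
    """Return the processing cycle of the current operation in which the
    next operation may be initiated on the same DES Unit.

    Assumption (taken from the text): with a two-stage pipeline, the next
    operation may start one cycle before the current result is available.
    """
    if latency_cycles < 2:
        raise ValueError("latency must be at least 2 cycles")
    return latency_cycles - 1


# Example from the text: a DES result takes five cycles, so operation 2
# may be initiated in processing cycle 4 of operation 1.
print(earliest_next_start(5))  # 4
```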
The multiple DES Unit CBC Decipher data flow is shown
in Fig. 1. In this configuration, a common input bus (DES_DATA_IN)
is used to provide input data to all DES units from a common input
buffer. This same bus also supplies data to the Chain Value Buffer
(CV BUFFER). All DES unit output buses are multiplexed to form a
bus (DES_DATA_OUT). A single exclusive-OR stage is used to generate
CBC Decipher results, which are stored in a common output buffer.
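
As a software illustration of this data flow, the sketch below follows the standard CBC decipher rule: each result is the block decipher output exclusive-ORed with the previous ciphertext block (the chain value). It is not the disclosed hardware. The toy_encipher and toy_decipher functions are hypothetical stand-ins for a DES Unit and are NOT DES; all names are assumptions chosen for illustration.

```python
from typing import Callable, List

BLOCK_BYTES = 8  # DES operates on 64-bit blocks
TOY_KEY = bytes(range(1, BLOCK_BYTES + 1))  # hypothetical toy key, not a DES key


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise exclusive-OR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encipher(block: bytes) -> bytes:
    """Stand-in block cipher (NOT DES): rotate left one byte, then mask."""
    return xor_blocks(block[1:] + block[:1], TOY_KEY)


def toy_decipher(block: bytes) -> bytes:
    """Inverse of toy_encipher: unmask, then rotate right one byte."""
    unmasked = xor_blocks(block, TOY_KEY)
    return unmasked[-1:] + unmasked[:-1]


def cbc_encipher(blocks: List[bytes], iv: bytes,
                 encipher: Callable[[bytes], bytes]) -> List[bytes]:
    """Standard CBC encipher, included only to produce test input."""
    chain_value, out = iv, []
    for plain in blocks:
        chain_value = encipher(xor_blocks(plain, chain_value))
        out.append(chain_value)
    return out


def cbc_decipher(blocks: List[bytes], iv: bytes,
                 decipher: Callable[[bytes], bytes]) -> List[bytes]:
    """Standard CBC decipher.

    decipher(cipher) depends only on its own input block, so these calls
    could be issued to different units; only the final XOR needs the
    previous ciphertext block held as the chain value.
    """
    chain_value, out = iv, []
    for cipher in blocks:
        out.append(xor_blocks(decipher(cipher), chain_value))
        chain_value = cipher  # the input block becomes the next chain value
    return out


plain = [b"ABCDEFGH", b"12345678", b"abcdefgh"]
iv = bytes(BLOCK_BYTES)
cipher = cbc_encipher(plain, iv, toy_encipher)
print(cbc_decipher(cipher, iv, toy_decipher) == plain)  # True
```

Because the block decipher step has no dependency on earlier results, CBC Decipher lends itself to parallel or pipelined DES Units in a way that CBC Encipher does not.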
The organization described is typically referred to as Single
Instruction-Single Data (SISD). By providing common input/output
buffers and shared bus structures, the SISD data flow organization
resolves the scalability problems inherent in the design of parallel
In order to increase CBC Decipher performance using the SISD
data flow model, pipelining is employed across all DES Units. Two
parameters determine the upper performance bound for CBC Decipher
operations in this environment:
o The number of clock cycles n required to produce one DES
  result
o The amount of skew s which is introduced between successive
  decipher operations

The actual speed-up attainable for the CBC Decipher function is
determined by the following formula:

  Speed-Up = mod|(n-1)/s|

In this formula, the numerator n-1 represents the maximum
number of DES Units which can be utilized in performing pipelined CBC
Decipher operations. The -1 factor relates to the earlier assumption
made about pipelining of operations on a single DES Unit.
Furthermore, the denominator s indicates that maximum speed-up is
attained whenever the skew between successive decipher operations is
one DES Unit clock cycle. As an example, consider the case where
each DES Unit takes 5 clock cycles to produce one decipher result.
If the skew between successive decipher operations is one clock
cycle, then the maximum speed-up provided is:
  Speed-Up(max) = mod|(5-1)/1| = 4
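
The speed-up formula can be checked numerically with a short sketch. The text does not define the mod|...| notation; the code below assumes it denotes the integer part of the quotient, which agrees with the worked example but is an interpretation. The function name speed_up is hypothetical.

```python
def speed_up(n: int, s: int) -> int:
    """Upper bound on CBC Decipher speed-up.

    n: clock cycles required to produce one DES result
    s: skew, in clock cycles, between successive decipher operations

    Assumption: mod|(n-1)/s| is read as the integer part of (n-1)/s.
    """
    if n < 2:
        raise ValueError("n must be at least 2 cycles")
    if s < 1:
        raise ValueError("s must be at least 1 cycle")
    return (n - 1) // s


# Worked example from the text: n = 5 cycles, s = 1 cycle.
print(speed_up(5, 1))  # 4
```

Under this reading, a larger skew lowers the bound; for example, speed_up(5, 2) evaluates to 2.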
The Chain Value Buffer is shown in Fig. 2.
Each pair of registers (HOLDi REG, CVi REG) is associated with one
particular DES Unit (D...