Design Details
--------------

This section goes into a few of the more obscure details not covered in the
section on security analysis, such as the encryption algorithm used by SFS, the
generation of random numbers, the handling of initialization vectors (IV's),
and a brief overview on the deletion of sensitive information retained in
memory after a program has terminated (this is covered in more detail in the
section "Security Analysis" above).


The Encryption Algorithm used in SFS

Great care must be taken when choosing an encryption algorithm for use in
security software.  For example, the standard Unix crypt(1) command is based on
a software implementation of a rotor machine encryption device of the kind
which was broken by mathematicians using pencil and paper (and, later on, some
of the first electronic computers) in the late 1930's[1].  Indeed, there exists
a program called `crypt breaker's workbench' which allows the automated
breaking of data encrypted using the crypt(1) command[2].  The insecurity of
various other programs has been mentioned elsewhere.  It is therefore
imperative that a reliable encryption system, based on world-wide security
standards, and easily verifiable by consulting these standards, be used.

When a block cipher is used as a stream cipher by running it in CFB (cipher
feedback) mode, there is no need for the cipher's block transformation to be a
reversible one as it is only ever run in one direction (generally
encrypt-only).  Therefore the use of a reversible cipher such as DES or IDEA is
unnecessary, and any secure one-way block transformation can be substituted.
This fact allows the use of one-way hash functions, which have much larger
block sizes (128 or more bits) and key spaces (512 or more bits) than most
reversible block ciphers in use today.

The transformation involved in a one-way hash function takes an initial hash
value H and a data block D, and hashes it to create a new hash value H':

    hash( H, D ) -> H'

or, more specifically, in the function used in SFS:

    H + hash( D ) -> H'

This operation is explained in more detail in FIPS Publication 180 and ANSI
X9.30 part 2, which define the Secure Hash Standard.  By using H as the data
block to be encrypted and D as the key, we can make the output value H'
dependant on a user-supplied key. That is, when H is the plaintext, D is the
encryption key, and H' is the ciphertext:

     plaintext H
         |
         v
    +---------+
    |   SHS   |<- key D
    +---------+
         |
         v
    ciphertext H'

If we regard it as a block cipher, the above becomes:

    H' = SHS( H )

which is actually:

    C  =   e( P )

Since we can only ever "encrypt" using a one-way hash function, we need to run
the "cipher" in cipher feedback mode, which doesn't require a reversible
encryption algorithm.

By the properties of the hash function, it is computationally infeasible to
either recover the key D or to control the transformation H -> H' (in other
words given a value for H' we cannot predict the H which generated it, and
given control over the value H we cannot generate an arbitrary H' from it).

The MDC encryption algorithm is a general encryption system which will take any
one-way hash function and turn it into a stream cipher running in CFB mode. The
recommended one-way hash function for MDC is the Secure Hash Standard as
specified in Federal Information Processing Standards (FIPS) Publication 180
and ANSI X9.30 part 2.  SHS is used as the block transformation in a block
cipher run in CFB mode as detailed in AS 2805.5.2 section 8 and ISO 10116:1991
section 6, with the two parameters (the size of the feedback and plaintext
variables) j and k both being set to the SHS block size of 160 bits.  The
properties of this mode of operation are given in Appendix A3 of AS 2805.5.2
and Annex A.3 of ISO 10116:1991.  The CFB mode of operation is also detailed in
a number of other standards such as FIPS Publication 81 and USSR Government
Standard GOST 28147-89, Section 4.  The use of an initialization vector (IV) is
as given in ISO 10126-2:1991 section 4.2, except that the size of the IV is
increased to 160 bits from the 48 or 64 bits value given in the standard. This
is again detailed in a number of other standards such as GOST 28147-89 Section
3.1.2. The derivation of the IV is given in the section "Encryption
Considerations" below.

The key setup for the MDC encryption algorithm is performed by running the
cipher over the encryption key (in effect encrypting the key with MDC using
itself as the key) and using the encrypted value as the new encryption key.
This procedure is then repeated a number of times to make a "brute-force"
decryption attack more difficult, as per the recommendation in the Public-Key
Cryptography Standard (PKCS), part 1.  This reduces any input key, even one
which contains regular data patterns, to a block of white noise equal in size
to the MDC key data.

The exact key scheduling process for MDC is as follows:

 Initialization:

 - The SHS hash value H is set to the key IV[3].
 - The SHS data block D is set to all zeroes.
 - The key data of length 2048 bits is set to a 16-bit big-endian value
   containing the length of the user key in bytes, followed by up to 2032 bits
   of user key.

   SHS hash value H = key IV;
   SHS data block D = zeroes;
   key_data [0:15] = length of user key in bytes;
   key_data [16:2047] = user key, zero-padded;

 Key schedule:

 - The following process is iterated a number of times:

   - The 2048-bit key data block is encrypted using MDC.
   - Enough of the encrypted key data is copied from the start of the key data
     block into the SHS data block D to fill it.

   for i = 1 to 200 do
       encrypted_key = encrypt( key_data );
       D = encrypted_key;

During the repeated encryptions, the IV is never reset.  This means that the IV
from the end of the n-1 th data block is re-injected into the start of the n th
data block.  After 200 iterations, the "randomness" present in the key has been
diffused throughout the entire key data block.

Although the full length of the key data block is 2048 bits, the SHS algorithm
only uses 512 bits of this (corresponding to the SHS data block D) per
iteration.  The remaining 1536 bits take part in the computation (by being
carried along via the IV) but are not used directly.  By current estimates
there are around 2^256 atoms in the universe.  Compiling a table of all 2^512
possible keys which would be necessary for a brute-force attack on MDC would
therefore be a considerable challenge to an attacker, requiring, at least, the
creation of another 512 * 2^256 universes to hold all the keys.  Even allowing
for the current best-case estimate of a creation time of 7 days per universe,
the mere creation of storage space for all the keys would take an unimaginably
large amount of time.

The SFS key schedule operation has been deliberately designed to slow down
special hardware implementations, since the encryption algorithm is rekeyed
after each iteration.  Normal high-speed password-cracking hardware would (for
example, with DES) have 16 separate key schedules in a circular buffer, each
being applied to a different stage of a 16-stage pipeline (one stage per DES
round) allowing a new result to be obtained in every clock cycle once the
pipeline is filled.  In MDC the key data is reused multiple times during the 80
rounds of SHS, requiring 80 separate key schedules for the same performance as
the 16 DES ones.  However since the algorithm is rekeyed after every iteration
for a total of 200 iterations, this process must either be repeated 200 times
(for a corresponding slowdown factor of 200), or the full pipeline must be
extended to 16,000 stages to allow the one-result-per-cycle performance which
the 16-stage DES pipeline can deliver (assuming the rekeying operation can be
performed in a single cycle - in fact the 1024-bit keys used by MDC/SHS need to
be processed in 7 sequential 160-bit blocks due to SHS's block size, stretching
the pipeline to 112,000 stages).  Changing the iteration count to a higher
value will further slow down this process.

The number of iterations of key encryption is controlled by the user, and is
generally done some hundreds of times.  The setup process in SFS has been tuned
to take approximately half a second on a workstation rated at around 15 MIPS
(corresponding to 200 iterations of the encryption process), making a
brute-force password attack very time-consuming.  Note that the key IV is
injected at the earliest possible moment in the key schedule rather than at the
very end, making the use of a precomputed data attack impossible.  The standard
method of injecting the encryption IV at the end of the key schedule process
offers very little protection against an attack using precomputed data, as it
is still possible to precompute the key schedules and simply drop in the
encryption IV at the last possible moment.

Footnote [1]: This is covered in a number of books, for example Welchman's "The
              Hut Six Story: Breaking the Enigma Codes", New York, McGraw-Hill
              1982, and Kahns "Seizing the Enigma", Boston, Houghton-Mifflin
              1991.

Footnote [2]: Available from ftp.ox.ac.uk in the directory /src/security as
              cbw.tar.Z.

Footnote [3]: Some sources would refer to this value as a `salt'.  The term
              `key IV' is used here as this is probably a more accurate
              description of its function.


Generating Random Numbers

One thing which cryptosystems consume in large quantities are random numbers.
Not just any old random value, but cryptographically strong random numbers.  A
cryptographically strong random value is one which cannot be predicted by an
attacker (if the attacker can predict the values which are used to set up
encryption keys, then they can make a guess at the encryption key itself).
This automatically rules out all software means of generating random values,
and means specialised hardware must be used.

Very few PC's are currently equipped with this type of hardware.  However SFS
requires 1024 random bits for each encrypted disk, in the form of the disk key
(see the section "Password Lifetimes and Scope" above).  SFS therefore uses a
number of sources of random numbers, both ones present in the hardware of the
PC and one external source:

  - Various hardware timers which are read occasionally when the program is
    running (generally after operations which are long and complex and will be
    heavily affected by external influences such as interrupts, video, screen,
    and disk I/O, and other factors.

  - The contents and status information of the keyboard buffer

  - Disk driver controller and status information

  - Mouse data and information

  - Video controller registers and status information

[ - The clock skew between two hardware clocks available on the PC.  Due to
    background system activity such as interrupt servicing, disk activity, and
    variable-length instruction execution times, these clocks run out-of-phase.
    SFS uses this phase difference as a source of random numbers.  NB: Not
    implemented yet].

  - The timing of keystrokes when the password is entered.  SFS reads the
    high-speed 1.19 MHz hardware timer after each keystroke and uses the timer
    values as a source of random numbers.  This timer is used both to measure
    keystroke latency when the password is entered and read at random times
    during program execution.  Trials have shown that this 16-bit counter
    yields around 8 bits of random information (the exact information content
    is difficult to gauge as background interrupts, video updates, disk
    operations, protected-mode supervisor software, and other factors greatly
    affect any accesses to this counter, but an estimate of 8 bits is probably
    close enough[1]).

  - The timing of disk access latency for random disk reads.  The exact
    operation is as follows:

        Read a timer-based random disk sector
        Add its contents (8 bits)
        Read the high-speed 1.19 MHz hardware timer (13 bits)
        Use the two values for the next random sector

    This is repeated as often as required (in the case of SFS this is 10
    times).  Assuming a (currently rather optimistic) maximum of 5ms to acquire
    a sector this provides about 13 bits of randomness per disk operation.  The
    number of factors which influence this value is rather high, and includes
    among other things the time it takes the BIOS to process the request, the
    time it takes the controller to process the request, time to seek to the
    track on the disk, time to read the data (or, if disk cacheing is used,
    time for the occasional cache hit), time to send it to the PC, time to
    switch in and out of protected mode when putting it in the cache, and of
    course the constant 3-degree background radiation of other interrupts and
    system activity happening at the same time.  If a solid-state disk were
    being used, the hardware latency would be significantly reduced, but
    currently virtually no 386-class PC's have solid-state disks (they're
    reserved for palmtops and the like), so this isn't a major concern.

An estimate of the number of random bits available from each source is as
follows:

    Keystroke latency, 8 bits per key      80 bits for minimum 10-char key
    Second password entry for encryption   80 bits for minimum 10-char key
    Disk access latency, 13 bits per read 130 bits for 10 reads
    Disk sector data, 8 bits               80 bits for 10 reads
    System clocks and timers                3 bits
    Video controller information            4 bits
    Keyboard buffer information             4 bits
    Disk status information                 4 bits
    General system status                   4 bits
    Random high-speed timer reads         120 bits for 15 reads
                                          --------
    Total                                 509 bits

These figures are very conservative estimates only, and are based on timing
experiments with typed-in passwords and a careful scrutiny of the PC's hardware
and system status data.  For example, although the time to access a disk sector
for a particular drive may be 10ms or more, the actual variation on that 10ms
may only be +/- 2ms.  The figures given above were taken by averaging the
variation in times for large numbers of tests.  In practice (especially with
longer passwords) the number of random bits is increased somewhat (for example
with a 30-character password the total may be as high as 829 bits of random
information).  However even the minimal estimate of 509 bits is adequate for
the 512-bit key required by MDC.

Each quantum of semi-random information is exclusive-ored into a 1024-bit
buffer which is initially set to all zeroes.  Once 1024 bits of buffer have
been filled, the data is encrypted with MDC to distribute the information, and
the first 512 bits of the 1024-bit buffer is used as the key for the next MDC
encyrption pass.  Then more data is added until, again, 1024 bits of buffer
have been filled, whereupon the data is again mixed by encrypting it with MDC.
This process is repeated several times, with the amount of "randomness" in the
buffer increasing with each iteration.

Before being used, this data is encrypted 10 times over with MDC to ensure a
complete diffusion of randomness.  Since the IV for the encryption is reused
for the next pass through the buffer, any information from the end of the
buffer is thus reinjected at the start of the buffer on the next encryption
pass.

Although this method of generating random numbers is not perfect, it seems to
be the best available using the existing hardware.  General estimates of the
exact amount of truly random information which can be acquired in this manner
are in the vicinity of several hundred bits.  Proposed attacks all make the
assumption that an attacker is in possession of what amounts to a complete
hardware trace of events on the machine in question.  Allowing for a reasonable
amount of physical system security, it can be assumed that the random data used
in SFS is unpredictable enough to provide an adequate amount of security
against all but the most determined attacker.

Footnote [1]: If an opponent can obtain several hours of keystroke timings and
              can come up with a good model including serial correlations, they
              may be able to reduce the likely inputs to the random number
              accumulator to a somewhat smaller value, or at least bias their
              guesses to fall within the range of likely values.


Encryption Considerations

When a block cipher is converted to handle units of data larger than its
intrinsic block size, a number of weaknesses can be introduced, depending on
the mode of operation which is chosen for the block cipher.  For example, if
two identical ciphertext blocks are present in different locations in a file,
this may be used to determine the plaintext.  If we can find two identical
blocks of ciphertext when cipher block chaining (CBC) is used, then we know
that:

    P[ i ] = d( C[ i ] ) ^ C[ i-1 ]
    P[ j ] = d( C[ j ] ) ^ C[ j-1 ]

where C is the ciphertext, P is the plaintext, and e() and d() are encryption
and decryption respectively.  Now if C[ i ] = C[ j ], then d( C[ i ] ) =
d( C[ j ] ), which cancel out when xor'd so that:

    P[ i ] ^ C[ i-1 ] = P[ j ] ^ C[ j-1 ]

or:

     P[ j ] = P[ i ] ^ C[ i-1 ] ^ C[ j-1 ]

Knowing C[ i ] and C[ j ] we can determine P[ i ] and P[ j ], and knowing
either P[ i ] or P[ j ] we can determine the other.

Something similar holds when cipher feedback (CFB) mode is used, except that
now the decryption operation is:

    P[ i ] = e( C[ i-1 ] ) ^ C[ i ]
    P[ j ] = e( C[ j-1 ] ) ^ C[ j ]

Now if C[ i ] = C[ j ] then e( C[ i ] ) = e( C[ j ] ) (recall that in CFB mode
the block cipher is only ever used for encryption), so that they again cancel
out, so:

    P[ i ] ^ e( C[ i-1 ] ) = P[ j ] ^ e( C[ j-1 ] )

or:

    P[ i ] = P[ j ] ^ e( C[ i-1 ] ) ^ e( C[ j-1 ] )

In general this problem is of little consequence since the probability of
finding two equal blocks of ciphertext when using a 160-bit block cipher on a
dataset of any practical size is negligible.  More esoteric modes of operation
such as plaintext feedback (PFB) and ones whose acronyms have more letters than
Welsh place names tend to have their own special problems and aren't considered
here.

The problem does become serious, however, in the case of sector-level
encryption, where the initialization vector cannot be varied.  Although the IV
may be unique for each sector, it remains constant unless special measures such
as reserving extra storage for sector IV's which are updated with each sector
write are taken.  If a sector is read from disk, a small change made to part of
it (for example changing a word in a text file), and the sector written out to
disk again, several unchanged ciphertext/plaintext pairs will be present,
allowing the above attack to be applied.  However, there are cases in which
this can be a problem.  For example, running a program such as a disk
defragmenter will rewrite a large number of sectors while leaving the IV
unchanged, allowing an opponent access to large quantities of XOR'd plaintext
blocks simply by recording the disk contents before and after the
defragmentation process.  Normally this problem would be avoided by using a
different IV for each encrypted message, but most disk systems don't have the
room to store an entire sectors worth of data as well as the IV needed to
en/decrypt it.

An additional disadvantage of the CFB encryption mode is that the data in the
last block of a dataset may be altered by an attacker to give different
plaintext without it affecting the rest of the block, since the altered
ciphertext in the last block never enters the feedback loop.  This type of
attack requires that an opponent possess at least two copies of the ciphertext,
and that they differ only in the contents of the last block.  In this case the
last ciphertext block from one copy can be subsituted for the last ciphertext
block in the other copy, allowing a subtle form of message modification attack.
In fact in combination with the previously mentioned weakness of CFB, an
attacker can determine the XOR of the plaintexts in the last block and
substitute an arbitrary piece of "encrypted" plaintext to replace the existing
data.

There are several approaches to tackling this problem.  The most simplistic one
is to permute the plaintext in a key-dependant manner before encryption and
after decryption.  This solution is unsatisfactory as it simply shuffles the
data around without necessarily affecting any particular plaintext or
ciphertext block.  The desired goal of a change in any part of the plaintext
affecting the entire dataset is not achieved.

A better solution is to encrypt data twice, once from front to back and then
from back to front[1].  The front-to-back pass propagates any dependencies to
the back of the dataset, and the back-to-front pass propagates dependencies
back to the front again.  In this way a single change in any part of the
plaintext affects the entire dataset.  The disadvantage of this approach is
that it at least halves the speed of the encryption, as all data must be
encrypted twice. If the encryption is done in software, this may create an
unacceptable loss of throughput.  Even with hardware assistance there is a
noticeable slowdown, as no hardware implementations easily support backwards
encryption, requiring the data to be reversed in software before the second
pass is possible.

The best solution is probably to use a word-wise scrambler polynomial like the
one used in SHA.  With a block of plaintext P this is:

    P[ i ] ^= P[ i-K1 ] ^ P[ i-K2 ]

with suitable values for the constants K1 and K2.  If K2 is chosen to be 5 (the
SHA block size in words) then the initial values of the 5 words (which can be
thought of as as P[ -5 ]...P[ -1 ]) are simply the sectorIV.  The value of K1
is arbitrary, SFS uses a value of 4.

This technique is used by first setting the initial values of the 5 words to
the sectorIV.  The scrambler function is then run over the entire data block,
propagating all dependencies to the last 5 words in the block.  These last 5
words are then used as the IV for the CFB encryption of the entire block.  In
this way the encryption IV depends on all the bits in the block, and the
scrambling does a moderately good job of breaking up statistical patterns in
the plaintext.  No information is lost, so no randomness in the sectorIV is
misplaced.

This also provides resistance to the selective-modification attack which allows
an attacker to change selected bits in the last block of a CFB-encrypted
dataset without damage.  By destroying the IV used in the CFB encryption, the
first block is completely corrupted, which is unlikely to go unnoticed.

To decrypt a dataset encrypted in this manner, the first 5 words of ciphertext
are shifted into the feedback path, and the remainder of the dataset is
decrypted in the standard manner.  The last 5 decrypted words are then used as
the IV to decrypt the first encrypted block.  Finally, the scrambler is run
over the recovered plaintext to undo the changes made during the encryption
scrambling.

The overall en/decryption process used by SFS, in the case of 512-byte sectors
and 32-bit words (so that each sector contains 128 words), is:

    Encryption:

        using sectorIV[ 0 ]...sectorIV[ 4 ] as the scrambler IV
            scramble data[ 0 ]...data[ 127 ]
        using data[ 127-5 ]...data[ 127-1 ] as the encryption IV
            encrypt data[ 0 ]...data[ 127 ]

    Decryption:

        using data[ 0 ]...data[ 4 ] as the encryption IV
            decrypt data[ 5 ]...data[ 127 ]
        using data[ 127-5 ]...data[ 127-1 ] as the encryption IV
            decrypt data[ 0 ]...data[ 4 ]
        using sectorIV[ 0 ]...sectorIV[ 4 ] as the scrambler IV
            scramble data[ 0 ]...data[ 127 ]

where the scrambling operation is:

        data[ i ] ^= data[ i-4 ] ^ data[ i-5 ]

as outlined above.  Note that the i-4 and i-5 th values referred to here are
the original, scrambled values, not the descrambled values.  The easiest way to
implement this is to cache the last 5 scrambled values and cyclically overwrite
them as each word in the data buffer is processed.

Footnote [1]:  To be precise, you need some sort of feedback from the end of
               a block on the first encryption pass to the start of the block
               on the next encryption pass.  A block can be encrypted forwards
               twice as long as the IV is wrapped back to the start of the
               block for the second encryption pass.


A Discussion of the MDC Encryption Algorithm[1]

(A word on notation:  The notation {0,1}^k is used to mean the set of all bit
strings of length k, and {0,1}* means the set of all bit strings, including the
empty string.  Any message can be viewed as a bit string by means of a suitable
encoding).

The encryption method used by SFS is somewhat unusual, and in some respects is
similar to Merkle's "Meta Method" for obtaining cryptosystems[2].  The method
relies on the existence of secure one-way hash functions.  A hash function is a
function that takes as input an arbitrary number of bits and produces a
fixed-sized output called the "message digest".  In other words, hash functions
have the form

    h : {0,1}* --> {0,1}^k              for some fixed k,

and the hash of a message M is defined to be h( M ).  A secure one-way hash
function is a hash function with the following properties:

    1. For each message M, it is easy to compute h( M ).

    2. Given M, it is computationally infeasible to compute M' with
       h( M ) = h( M' ) (secure against forgery).

    3. It is computationally infeasible to compute M and M' with
       h( M ) = h( M' ) (secure against collisions).

For a good, but rather technical, discussion of hash functions, see
"Contemporary Cryptology. The Science of Information Integrity" edited by
Gustavus Simmons, IEEE Press, 1992 (ISBN 0-87942-277-7).

The terms "easy to compute" and "infeasible to compute" can be given more
precise definitions, but we'll settle for this informal terminology for now.
Roughly speaking, "easy to compute" means that it will take a tolerable amount
of time to compute the answer, even on a rather small machine; "infeasible to
compute" means that it should take eons to find out a particular result, even
when using millions of computers of the fastest conceivable technology in
parallel.

Examples of hash functions include the MD2, MD4, and MD5 hash functions,
developed by Ron Rivest of RSA Data Security, Inc., which have been (at least
in the case of MD4 and MD5) placed in the public domain, and the Secure Hash
Standard SHS, developed by NIST (with significant input from the NSA).  The
existence of secure one-way hash functions has not been proven, although there
exist some strong candidates, including MD5 and SHS.

The reference implementations of the above hashing functions include one
interesting aspect which makes it possible to use them as encryption functions.
Since the hashing of a very large amount of data in one sweep is not desirable
(because all the data would have to be in memory at the time of hashing), most
hashing algorithms allow data to be hashed incrementally.  This is made
possible by augmenting the definition of a hash function to include the state
of the last hashing operation.  In other words, a hash function now has the
form

    h : {0,1}^k x {0,1}* --> {0,1}^k,

where the first argument is the previous hash value, and the hash of a message
M = ( M1, M2, ..., Mn ) is defined to be

    h( h( ...( h( h0, M1 ), M2 ), ... ), Mn ).

(The value of all the h evaluations must not change if the message is broken up
into blocks of different lengths, but all of the previously mentioned hash
functions have that property).  Here, h0 is a fixed, known initial value that
is used in all hashing calculations.

This is not the way "real" hash functions behave, but it is close enough.  For
example, the MD5 hashing function has "initialization", "updating", and
"finalization" parts, where the finalization part appends the number of hashed
bytes to the message, hashes one final time, and returns the final hash value.
This means that the hashing "context" must include the number of bytes hashed
so far, without it being a part of the hash value.  The hash function can be
said to have "memory".

If we assume that h is a secure one-way hashing function, we can now use such
an h as a scrambling device.  For example, if we set E( M ) = h( h0, M ) for
every message M, M will not be recoverable from E( M ), because h is secure by
definition.  Another method would be to supply M to any standard MSDOS or UNIX
utility and use the resulting error message as the ciphertext (remembering that
a computer is a device for turning input into error messages).  However, there
are still two problems to be solved before we can use hash functions as
encryption functions:

    1. The scrambling process is not controlled by a key.

    2. The scrambling process is not invertible, so there is no way to
       decrypt the ciphertext.

Both problems can be solved by interchanging the roles of hash and data and by
using CFB mode in the encryption process.  In other words, let K be an
arbitrarily long key, let M = ( M1, ..., Mn ) be a message, broken up into
chunks of k bits, let IV be an initialization vector, and set

    C1 = M1 xor h( IV, K )
    Ci = Mi xor h( C( i-1 ), K )        for 1 < i <= n.

This is sent to the recipient, who easily recovers the plaintext by

    P1 = C1 xor h( IV, K )
    Pi = Ci xor h( C( i-1 ), K )                for 1 < i <= n,

since we have

    P1 = ( M1 xor h( IV, K ) ) xor h( IV, K )
       = M1 xor ( h( IV, K ) xor h( IV,K ) )    because xor is associative,
       = M1 xor 0,                              because x xor x = 0,
       = M1,                                    because x xor 0 = x,

and similarly for the Pi's.  This method of encryption also offers more
security than using ECB mode, assuming that this were possible with hash
functions, since the plaintext is diffused over the entire ciphertext,
destroying plaintext statistics, and thus foiling straightforward ciphertext
correlation attacks.

This method can clearly be used for any hash function which can hash
incrementally.  Thus, it is a "Meta Method" for turning hash functions into
encryption functions.  This is called the Message Digest Cipher (MDC) method of
encryption.  Specific instances of the method have the name of the hash
function added as a suffix.  For example, the MDC method applied to the MD5
hash function would be referred to as MDC/MD5.  SFS uses MDC/SHS.

Having analysed the inner workings of MDC, at least one theoretical attack on
the algorithm should be mentioned.  There are certain properties of hash
functions which may make them unsuitable for use as encryption algorithms.  For
example suppose knowledge of a 160-bit input/output pair to SHS leaks a
fraction of a bit of information about the data being hashed, maybe a quarter
of a bit.  This allows a search of 2^159.75 data blocks to find another data
block that causes the given input-output transformation, and thus finds a
second message which produces the same hash value.  This problem is not
significant when SHS is used as a cryptographic hash function, since it only
reduces the search space by 16% from the full 2^160 possibilities.  However
when SHS is used for encryption, it may be possible to accumulate these quarter
bits, so that after 2560 blocks (50K) of known plaintext, enough bits have been
accumulated to compute the encryption key.  This is because multiple
input/output pairs are available for a given data block, and each one puts more
constraints on the block until eventually you have the actual value can be
determined.

If a hash function is has the properties given above and no such information is
leaked, it can serve to create a strong encryption algorithm, but a serious
weakness in the encryption algorithm is not necessarily a serious weakness in
the hash function.  To date noone has ever demonstrated such a weakness, and
there are a number of similar "what if" arguments which can be used against
most encryption schemes.  For example if it were possible to build a quantum
computer then it could be used to break many of the most popular public-key
encryption schemes in use today.  The reason that these schemes aren't being
abandoned is that it is very unlikely that any computer of this form will be
built, and that if someone does manage it then the implications will be far
more serious than just the breaking of a number of encryption schemes.

Footnote [1]: Most of this analysis was contributed by Stephan Neuhaus,
              <neuhaus@informatik.uni-kl.de>

Footnote [2]: This is discussed further in Ralph Merkle's paper "One Way Hash
              Functions and DES", Crypto '89 Proceedings, Springer-Verlag,
              1989 (volume 435 of the Lecture Notes in Computer Science
              series).


Smart cards and SFS

Due to their sealed, self-contained nature, smart cards provide a reasonable
amount of security against many forms of attack such as the trojan horse
variety with which it is so easy to compromise security on a more
general-purpose computer such as a PC.  However in the last few years a large
amount of expertise in attacking smart-card based systems has been gathered by
people who target pay-TV systems which rely on smart-card based processors for
security.  Pay-TV pirates have been very successful in picking apart supposedly
secure processors in order to clone them or defeat their encryption.
Typically, this might be done by heating the device to 150 C and then
dissolving it in fuming nitric acid.  An alternative is to use sulphuric acid
at 300 C.  By using a special fixture which produces a small jet via capillary
action, the part of the package over the chip die can be removed.

Once the packaging has been removed, only the bond wires are left to hold the
leads to the die so the IC will fall apart, leaving the die available for
analysis.  If done correctly, this disassembly of the device will allow the
part to remain functional.  One generation of the Sky TV smart cards were
reverse-engineered using this technique to get to the chips and a scanning
electron microscope to read out the contents of the individual ROM cells.

Another commonly-used technique is to selectively erase the security fuses used
to protect some types of secure microprocessor, allowing the contents of the
ROM to be read out and disassembled.  For example, the security bits in the
8751 microcontroller aren't located anywhere near the EPROM in the device, and
can be selectively erased without affecting the main memory contents, which are
now unprotected and can be read out.  Another way to bypass the security bits
in this device is to manipulate the voltages and control signals fed to it,
which causes it to become confused about whether it should be accessing the
internal, protected memory, or external, unprotected memory.  By using code
located in the external memory it is possible to access the contents of the
internal memory.  This technique has been used to recover the memory contents
of an 87C51 on a smart card with all security features enabled.

Sometimes it is possible to find an address on the card which contains an
opcode corresponding to a jump to code stored in external memory, so that by
changing the external program, it is possible to sieze control of the card
(this method was used to hack the Futuretron video scrambling system).  It is
occasionally possible to gradually erase the security fuse in one-time
programmable and EPROM-programmable secure processors by slowly increasing or
decreasing the supply voltage fed to the chip.  Although this generally erases
or damages the data on the chip as well, it is possible to recover the contents
by combining information read from several chips.

A much more sophisticated way to reverse-engineer chips uses thermal neutron
imaging to perform a non-invasive analysis of the device.  Intense pulsed
neutron source (IPNS) thermal neutrons can be used to image crystal structures
of many angstroms, and is a non-ionizing form of radiation which can penetrate
several meters of material.  By setting the pulses to pass through cold crystal
structures, coherent waves of neutrons can be generated and various layers of a
chip can be focused on and analyzed by moving either the chip or the detector.
Although the electrical nature of the structure such as the values of any
stored charges can't be seen this way, it is possible to duplicate the
structure with a non-invasive method and then experiment with the voltages in
the duplicate.

This method of analysis requires vast resources, but there are many countries
with the necessary nuclear reactors, and creating thermal coherent neutron
waves isn't excessively difficult.  The method is also very difficult to
protect against.

Attacks of this sort, which involve physical analysis of the card contents and
which rely for their success on the ability to read out the internal circuitry
and memory contents of the card, apply only to cards using ROM or EPROM
technology, in which (typically) a memory cell is implemented as a fuse which
is blown when the card is programmed.  The blown and non-blown fuses can then
be read out using one of the techniques described above, and the memory
contents reconstructed.

In contrast, the cards used with SFS use EEPROM memory cells rather than EPROM
ones.  In an EEPROM, data is stored as the presence or abscence of an
electrical charge, which means that the contents of a memory cell cannot be
determined by a physical examination as they can for a conventional ROM or
EPROM cell.  An attack therefore requires disassembling the package and probing
the contents of each memory cell electrically, which requires significantly
more resources than an examination under an electron microscope.

As an additional security precaution, SFS encrypts all critical data stored on
the card in the same way that it encrypts data stored in the SFS volume header.

In normal operation, each SFS disk volume is encrypted with a unique disk key
which is completely unrelated to the the user passphrase, or key.  When the
user key is entered, it is used to decrypt the disk key, and the disk key is
then used to decrypt the encrypted volume.  There is no correlation between the
user key and the disk key, so that revealing the disk key does not reveal the
user key.  This access mechanism looks as follows:

 + User  - - - +            + Encrypted volume  - - - - - - - - - - - - +

 | +--------+  |  decrypt   | +--------+   decrypt    +--------------+  |
   |User Key|   ----------->  |Disk Key| -----------> |Encrypted Data|
 | +--------+  |            | +--------+              +--------------+  |

 + - - - - - - +            + - - - - - - - - - - - - - - - - - - - - - +

When used with a smart card, the user key is instead used to decrypt a key
stored in the smart card which is in turn used to decrypt the disk key:

 + User  - - - +            + Smart card  +

 | +--------+  |  decrypt   | +--------+  |  decrypt
   |User Key|   ----------->  |Card Key|   ----------+
 | +--------+  |            | +--------+  |          |
                                                     |
 + - - - - - - +            + - - - - - - +          |
                                                     |
                     +-------------------------------+
                     |
                     |      + Encrypted volume  - - - - - - - - - - - - +
                     |
                     |      | +--------+   decrypt    +--------------+  |
                     +----->  |Disk Key| -----------> |Encrypted Data|
                            | +--------+              +--------------+  |

                            + - - - - - - - - - - - - - - - - - - - - - +

Since the password is no longer used to directly decrypt the disk key to access
the encrypted volume, knowledge of the user password or key, or any attack
based on the user password or key is of no use unless the attacker is also in
posession of the smart card containing the card key.  Since a single card key
can be used to decrypt multiple disk keys, it is possible for one card to
control access to multiple encrypted volumes.  Finally, since each card key can
also contain extra information such as access rights to an encrypted volume, it
is possible for a central authority to issue to cards which control the type of
access allowed for the volume, such as cards which only grant read access to
their holders.

In order to convert a volume from direct user access to smart-card access, it
is necessary to first initialise the card for use with SFS by generating a
random card key and encrypting it with the user key, and then to bind the card
to a disk volume by encrypting the disk key with the card key.  The first,
initialization step is performed as follows:

    Generate a random card key
    Encrypt the card key with the user key for the card
    Write the encrypted card key to the card

                    Card key
                       |
                       v
      User key    +---------+
        for   --> | Encrypt |
       card       +---------+
                       |
                       v
                   Smart card

To recover the card key, the user key for the card is used to decrypt the key
stored on the card.

The second, card binding step is performed as follows:

    Decrypt the disk key with the user key for the disk
    Decrypt the card key with the user key for the card
    Encrypt the disk key with the card key
    Write the encrypted disk key to the disk

                   Smart card                    Disk volume
                       |                             |
                       v                             v
      User key    +---------+       User key    +---------+
        for   --> | Decrypt |         for   --> | Decrypt |
       card       +---------+        disk       +---------+
                       |                             |
                    Card key                      Disk key
                       |                             |
                       |               +-------------+
                       |               |
                       |               v
                       |          +---------+
                       +--------> | Encrypt |
                                  +---------+
                                       |
                                       v
                                   Disk volume

To recover the card key, the user key for the card is used to decrypt the key
stored on the card.  To recover the disk key, the decrypted card key obtained
in the previous step is used to decrypt the key stored on the disk.

To access a volume using the smart card, the following process is used:

    Decrypt the card key with the user key for the card
    Decrypt the disk key with the card key
    Access the volume using the disk key

                   Smart card
                       |
                       v
      User key    +---------+
        for   --> | Decrypt |
       card       +---------+
                       |
                    Card key      Disk volume
                       |               |
                       |               v
                       |          +---------+
                       +--------> | Encrypt |
                                  +---------+
                                       |
                                       v
                                   Disk key

Access to the disk volume is now possible only to someone in posession of the
smart card containing the card key.  Since multiple disk keys can be encrypted
with the same card key, one card can control access to a number of disk
volumes.  Since the card key and disk key are completely unrelated, acquiring
the smart card containing the encrypted card key without the user key needed to
decrypt it does not provide an attacker with access to the disk volume.


Deletion of SFS Volumes

Truly deleting data from magnetic media is very difficult.  The problem lies in
the fact that when data is written to the medium, the write head sets the
polarity of most, but not all, of the magnetic domains.  This is partially due
to the inability of the writing device to write in exactly the same location
each time, and partially due to the variations in media sensitivity and field
strength over time and among devices.

In general terms, when a one is written to disk, the media records a one, and
when a zero is written, the media records a zero.  However the actual effect is
closer to obtaining something like 0.95 when a zero is overwritten with a one,
and 1.05 when a one is overwritten with a one.  Normal disk circuitry is set up
so that both these values are read as ones, but using specialized circuitry it
is possible to work out what previous `layers' contained (in fact on some
systems it may be possible to recover previous data with a simple software
modification to the hardware control code.  Some drives are capable of
performing "offset reads" to the side of the normal track centres (for example
using SCSI diagnostic commands), so that if the original data was written to
one side, the overwritten data can be recovered by reading from the other side
and applying error-recovery techniques).

This problem is further complicated by the fact that the heads might not pass
exactly over the same track position when data is rewritten, leaving a trace of
the old data still intact.  Current-generation drives reduce this problem
somewhat as track and linear densities have now become so high that the
traditional optical methods of extracting information from the disk platters
has become much more difficult, and in some cases impossible, as the linear bit
cell is below the optical diffraction limit for visible light.  While some data
patterns can still be discerned, recognizing others would be limited to some
subset of patterns.  However the recovery of at least one or two layers of
overwritten data isn't too hard to perform by replacing the drive electronics
(but not the read/write head), reading off the "obvious" signal, computing what
it would look like if written to blank media, subtracting that from what was
actually read, and recovering the overwritten data from the difference, a
technique which has occasionally been used by hard drive manufacturers to
recover accidentally overwritten or lost data.

Despite this small respite, when all the above factors are combined it turns
out that each track on a piece of media contains an image of everything ever
written to it, but that the contribution from each `layer' gets progressively
smaller the further back it was made.  Using even more advanced techniques than
those mentioned above, like low energy electron scattering, ferrofluid with
optical tracers, or related methods, followed by the same signal-processing
technology which is used to clean up satellite images, low-level recorded
speech, and the like, it is possible to recover layers of previous data with a
remarkable degree of accuracy, to a level limited only by the sensitivity of
the equipment and the amount of expertise of the organisation attempting the
recovery.  Intelligence organisations have a *lot* of expertise in this field.

The general concept behind the overwriting scheme used by SFS is to flip each
magnetic domain on the disk back and forth as much as possible (this is the
basic idea behind degaussing) without writing the same pattern twice in a row.
If the data was encoded directly, we could simply choose the desired overwrite
pattern of ones and zeroes and write it repeatedly.  However, disks always use
an NRZI encoding scheme in which a 1 bit signifies an inversion, making the
desired pattern a series of one bits.  This leads to a further complication as
all disks use some form of run-length limited (RLL) encoding, so that the
adjacent ones won't be written.  This encoding is used to ensure that
transitions aren't placed too closely together, or too far apart, which would
mean the drive would lose track of where it was in the data.

The parameter to be aware of here is the separation of 1 bits.  Floppies (which
are a more primitive form of the technology used in hard disks) like to keep
the 1 bits 4us apart.  However they can't be kept too far apart or the read
clock loses synchronisation.  This "too far" figure depends a lot on the
technology in the drive, it doesn't depend on the magnetic media much (unlike
the "too close" figure, which depends a lot on the media involved).  The first
single-density encoding wrote one user data bit, then one "1" clock bit, taking
a total of 8us.  This was called FM, since a 1 bit was encoded as two
transitions (1 wavelength) per 8 us, while a 0 bit was encoded as one
transition (1/2 wavelength).

Then it was discovered that it was possible to have two 0 bits between adjacent
1s.  The resulting encoding of 4 bits into 5 was called group code recording
(GCR) which was (0,2) RLL.  This allowed 4 user data bits to be written in
5 * 4us = 20 us, for an effective time per user bit of 5 us, which was a big
improvement over 8 us.

But GCR encoding was somewhat complex.  A different approach was taken in
modified FM (MFM), which suppressed the 1 clock bit except between adjacent
0's.  Thus, 0000 turned into 0(1)0(1)0(1)0 (where the ()s are the inserted
clock bits), 1111 turned into 1(0)1(0)1(0)1, and 1010 turned into
1(0)0(0)1(0)0.  The maximum time between 1 bits was now three 0 bits.  However,
there was at least one 0 bit, so it became possible to clock the bits at
2us/bit and not have more than one 1 bit every 4us.  This achieved one user bit
per 4 us, a result which was better than GCR and obtainable with simpler
circuitry.  As can be seen, the effective data rate depends on the bit rate
(which has been 4 us, 4 us and 2 us in these examples) and the encoding rate, a
measure of the encoding efficiency.  The encoding rates have been 1/2, 4/5 and
1/2.

There is also a (2,7) RLL code with rate 1/2, meaning that 1 user bit maps to 2
encoded bits (although the mapping involves multi-bit groups and is somewhat
complex), but there are always at least two 0 bits between 1 bits, so 1 bits
happen at most every 3 bit times.  This allows the clock to be reduced to
1.3333 us (this 2/1.33333 = 6/4 = 3/2 is the source of the added capacity gain
of RLL hard drive controllers over equivalent MFM controllers).  The time per
user bit is now 2.6666 = 2 2/3 us.

However, the faster clock is finicky.  It is also possible to use a (1,7) RLL
encoding.  Since this is (1,x), the clock rate is 2 us/bit, but the encoding
efficiency improves from 1/2 to 2/3.  This allows 2 effective user bits per 6
us, or 3 us per user bit.  For hard drives, it is easier to increase the clock
rate by an extra 9/8 than to fiddle with a clock 50% faster, so this is very
popular with more recent disk drives.

To erase magnetic media, we need to overwrite it many times with alternating
patterns in order to expose it to a magnetic field oscillating fast enough that
it does the desired flipping of the magnetic domains in a reasonable amount of
time.  Unfortunately, there is a complication in that we need to saturate the
disk surface to the greatest depth possible, and very high frequency signals
only "scratch the surface" of the magnetic medium.  Disk drive manufacturers,
in trying to achieve ever-higher densities, use the highest possible
frequencies, whereas we really require the lowest frequency a disk drive can
produce.  Even this is still rather high.  In fact, a full erasure of the disk
is generally impossible to perform, as can be seen by considering the manner in
which servo information is encoded.

All disk drives need what is called "servo" information indicating where the
head is relative to the centre of the track it is trying to read.  Older drives
used one of two techniques, the first of which was the "separate servo"
technique in which one surface of the stack of platters was reserved for servo
information used to control the stack of arms for the other platters.  This
gave a high update rate for the servo information, but didn't allow for any
adjustment for differential expansion of the platters or arms, or tilting of
the stack of arms.  This multiplexing in space of the servo information was
limited by the accuracy to which two heads could be aligned in the same stack.

The other technique was the "embedded servo" technique in which the servo
information was placed on the same platter as the data, between the data blocks
in the header area.  This meant that the same head was used to read the servo
information and disk data, but led to gaps while reading and writing the data.
This technique multiplexes the servo information in time.

A newer technique is called "servo-under" (also known as "dedicated servo"),
and writes the servo information "under" the data, at a lower frequency.  The
normal data write frequencies are high enough, and the write signal weak
enough, that it never penetrates to the bottom of the the magnetic media and so
the servo is unaffected by writing to the disk (in fact, even if the head was
writing DC it probably wouldn't be able to generate a strong enough signal to
penetrate to the servo data).  A couple of filters separate the servo
information from the data, so that the servo information can be accessed from
the same head and at the same time as the data is read.  This technique is very
convenient, but shows that the media can never be completely saturated, since
doing this would erase the servo information, making the disk useless.

However, what we can do is to use the lowest frequency possible for overwrites,
to penetrate as deeply as possible into the recording medium.  To do this, we
must determine what decoded data to write to produce a low-frequency encoded
signal.

The three most common RLL codes are (1,3) RLL (usually known as MFM), (2,7) RLL
(the original "RLL" format), and (1,7) RLL (which is popular in newer drives).
These three cover the vast majority of magnetic disk drives.  However, each of
these has several possible variants.  With MFM, only one is used with any
frequency, but the newest (1,7) RLL code has at least half a dozen variants in
use.  (1,3) RLL and (2,7) RLL have coding rates of 1/2, meaning that one
"useful" decoded bit requires two encoded bits.  (1,7) RLL has a rate of 2/3,
so two decoded bits correspond to three encoded bits.

For MFM, there are at most 4 bit times between transitions, so the lowest
frequency possible is "4T", attained by the repeating decoded data patterns
1010 and 0101.  These have a 1 bit every other "data" bit, and the intervening
"clock" bits are all 0.  To complete the set, we would like patterns with every
other clock bit set to 1 and all others set to 0, but these are not possible in
the MFM encoding (such "violations" are used to generate special marks on the
disk to identify sector boundaries).

The best we can do is 3 bit times between transitions, a 3T frequency, which is
generated by repeating the decoded patterns 100100, 010010 and 001001.  We
should use several of rounds of these patterns, as MFM drives are the oldest,
lowest-density drives around (this is especially true for the very-low-density
floppy drives).  As such, they are the easiest to recover data from with modern
equipment and we need to take the most care with them.

For (1,7) RLL, although there can be as many as 8 bit times between
transitions, the lowest sustained frequency we can have is 6T, 6 bit times
between transitions.  This is a desirable property from the point of view of
the clock-recovery circuitry, and all (1,7) RLL codes seem to have this
property.  We now need to find a way to write the desired pattern without
knowing the particular (1,7) RLL code used.  We can do this by looking at the
way the drive's error-correction system works.  The error-correction is applied
to the decoded data, even though errors generally occur in the encoded data.
In order to make this work well, the data encoding should have limited error
amplification, so that an erroneous encoded bit should affect only a small,
finite number of decoded bits.

Decoded bits therefore depend only on nearby encoded bits, so that a repeating
pattern of encoded bits will correspond to a repeating pattern of decoded bits.
The repeating pattern of encoded bits is 6 bits long.  Since the rate of the
code is 2/3, this corresponds to a repeating pattern of 4 decoded bits.  There
are only 16 possibilities for this pattern, making it feasible to write all of
them during the erase process.  So to achieve good overwriting of (1,7) RLL
disks, we write the patterns 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111,
1000, 1001, 1010, 1011, 1100, 1101, 1110, and 1111.  These patterns also
conveniently cover two of the ones needed for MFM overwrites, although we
should add a few more iterations of the MFM-specific patterns for the reasons
given above.

For (2,7) RLL, we again have 6T as the lowest systainable frequency, which
maps, at an encoding rate of 1/2, to some repeating pattern of 3 decoded bits,
so we require all 8 possible 3-bit repeating patterns.  The all-zeros and
all-ones patterns overlap with the (1,7) RLL patterns, leaving six others:

  001001001001001001001001
     2   4   9   2   4   9

in binary or 0x24 0x92 0x49, 0x92 0x49 0x24 and 0x49 0x24 0x92 in hex, and

  011011011011011011011011
     6   D   B   6   D   B

in binary or 0x6D 0xB6 0xDB, 0xB6 0xDB 0x6D and 0xDB 0x6D 0xB6 in hex.  The
first three are the same as the MFM patterns, so we need only three extra
patterns to cover (2,7) RLL drives.

Although (1,7) is more popular in recent drives, some hard drives do still use
(2,7) RLL.  These three patterns also cover any problems with endianness
issues, which weren't a concern in the previous two cases, but would be in this
case (actually, thanks to the strong influence of IBM mainframe drives,
everything seems to be uniformly big-endian within bytes, with the most
significant bit being written to the disk first).

For the latest crop of high-density drives which use methods like Partial-
Response Maximum-Likelihood (PRML) methods which may be roughly equated to the
trellis encoding done by V.32 modems in that it is effective but
computationally expensive, all we can do is write a variety of random patterns,
because the processing inside the drive is too complex to second-guess.
Fortunately, these drives push the limits of the magnetic media much more than
the old MFM drives ever did by encoding data with much smaller magnetic
domains, closer to the physical capacity of the magnetic media.  If these
drives require sophisticated signal processing just to read the most recently
written data, reading overwritten layers is also correspondingly more
difficult.  A good scrubbing with random data will do about as well as can be
expected.

We now have a set of 22 overwrite patterns which will erase everything,
regardless of the raw encoding.  The basic disk eraser can be improved slightly
by adding random passes before, after, and during the erase process, and by
performing the deterministic passes in random order to make it more difficult
to guess which of the known data passes were made at which point.

To deal with all this in one overwrite pattern, SFS uses the following sequence
of 35 consecutive writes to erase data:

   1.  Random
   2.  Random
   3.  Random
   4.  Random
   5.  01010101 01010101 01010101  0x55           (1,7) RLL             MFM
   6.  10101010 10101010 10101010  0xAA           (1,7) RLL             MFM
   7.  10010010 01001001 00100100  0x92 0x49 0x24            (2,7) RLL  MFM
   8.  01001001 00100100 10010010  0x49 0x24 0x92            (2,7) RLL  MFM
   9.  00100100 10010010 01001001  0x24 0x92 0x49            (2,7) RLL  MFM
  10.  00000000 00000000 00000000  0x00           (1,7) RLL  (2,7) RLL
  11.  00010001 00010001 00010001  0x11           (1,7) RLL
  12.  00100010 00100010 00100010  0x22           (1,7) RLL
  13.  00110011 00110011 00110011  0x33           (1,7) RLL
  14.  01000100 01000100 01000100  0x44           (1,7) RLL
  15.  01010101 01010101 01010101  0x55           (1,7) RLL             MFM
  16.  01100110 01100110 01100110  0x66           (1,7) RLL
  17.  01110111 01110111 01110111  0x77           (1,7) RLL
  18.  10001000 10001000 10001000  0x88           (1,7) RLL
  19.  10011001 10011001 10011001  0x99           (1,7) RLL
  20.  10101010 10101010 10101010  0xAA           (1,7) RLL             MFM
  21.  10111011 10111011 10111011  0xBB           (1,7) RLL
  22.  11001100 11001100 11001100  0xCC           (1,7) RLL
  23.  11011101 11011101 11011101  0xDD           (1,7) RLL
  24.  11101110 11101110 11101110  0xEE           (1,7) RLL
  25.  11111111 11111111 11111111  0xFF           (1,7) RLL  (2,7) RLL
  26.  10010010 01001001 00100100  0x92 0x49 0x24            (2,7) RLL  MFM
  27.  01001001 00100100 10010010  0x49 0x24 0x92            (2,7) RLL  MFM
  28.  00100100 10010010 01001001  0x24 0x92 0x49            (2,7) RLL  MFM
  29.  01101101 10110110 11011011  0x6D 0xB6 0xDB            (2,7) RLL
  30.  10110110 11011011 01101101  0xB6 0xDB 0x6D            (2,7) RLL
  31.  11011011 01101101 10110110  0xDB 0x6D 0xB6            (2,7) RLL
  32.  Random
  33.  Random
  34.  Random
  35.  Random

The MFM-specific patterns are repeated twice because MFM drives are generally
the lowest density, and are thus particularly easy to examine.  The
deterministic patterns between the random writes are permuted before the write
is performed, to make it more difficult for an opponent to use knowledge of the
erasure data written to attempt to recover overwritten data (in fact we need to
use a cryptographically strong random number generator to perform the
permutations to avoid an opponent who can read the last overwrite pass being
able to predict the previous passes and "echo cancel" passes by subtracting the
known overwrite data).

If the device being written to supports cacheing or buffering of data, SFS will
additionally attempt to disable the buffering to ensure physical disk writes
are performed (for example by setting the Force Unit Access bit during SCSI-2
Group 1 write commands).

There is a commonly-held belief that there is a US government standard for
declassifying magnetic media which simply involves overwriting data on it three
times.  There are in fact a number of standards[1] which contain simple phrases
such as "Magnetic disks, drums, and other similar rigid storage devices shall
be sanitized by overwriting all storage locations with any character, then the
complement of the character (e.g., binary ones and binary zeros) alternately a
minimum of three times".  However this simple description is usually
reinterpreted by the appropriate government agencies to a level which often
makes physical destruction of the media and its replacement with new media
preferable to attempting any declassification by overwriting the data (the
(classified) standards for truly declassifying magnetic media probably involve
concentrated acid, furnaces, belt sanders, or any combination of the above[2]).
One unclassified standard which contains information on data erasure is ANSI
X9.17-1985 "Financial Key Management (Wholesale)", Appendix E, "Erasing
(Zeroizing) Recording Media Used for Storage of Keying Material".  Section E.5,
"Disk Pack" contains three possible alternatives: "(a) Apply an emery wheel or
sander to the recording surface of an inoperative disk.  Ensure that the entire
surface is completely removed prior to disposal.  (b) The resin binder and
ferric oxide surfece can be completely removed/stripped (chemically destroyed)
from the disk with concentrated hydrochloric acid (55-58%).  (c)  Melting".
This applies only to non-classified data, the sanitizing of media containing
classified information is generally handled in a far more paranoid manner.

The use of such extreme measures is necessary not only because data recovery
from the actual tracks itself may (with some effort) be possible, but because
of factors like intelligent drives which keep so-called "alternative cylinders"
on a disk free to allow dynamic re-allocation of data to one of these tracks in
case a disk block develops errors.  The original block is no longer accessible
through any sort of normal access mechanism, and the data on it can't be
destroyed without going through some unusual contortions which tend to be
highly hardware-dependant.  Other complications include the use of journaling
filesystems which keep track of previous generations of data, and disk
compression software or hardware which will compress a simple repeated
overwrite pattern to nothing and leave the original data intact on the disk[3].
Therefore if "overwriting all storage locations" is interpreted to mean
"exposing the entire reading surface to a magnetic field having a strength at
the recording surface greater than the field intensity at which it was
recorded", the method does have merit.  Unfortunately it is virtually
impossible to get at all storage locations, and simple-minded methods such as
trying to write patterns to the storage allocated to a file in order to erase
it don't even come close to this target.  The overwrite method used by SFS does
come reasonably close by attempting to create a fluctuating magnetic field over
all physically accessible sectors which make up a disk volume, creating the
same effect as a degaussing tool used to erase magnetic fields.

One possibility for lessening the impact of encrypted data being retained on a
disk long after it should have vanished is to frequently change an encryption
parameter so that the encrypted data also changes when the encryption parameter
is changed.  For the case of SFS disk keys, all this would require is that at
certain intervals (for example each time the volume is mounted) the disk key is
re-encrypted with a different random initialization vector and written back to
the disk so that the key never has time to get "burned in" to the media.
However this still leaves the problem of the last-written key being present,
and is probably not worth the effort involved in constantly updating the key.

Another consideration which needs to be taken into account when trying to erase
data through software is that drives conforming to some of the higher-level
protocols such as the various SCSI standards are relatively free to interpret
commands sent to them in whichever way they choose (as long as they still
conform to the SCSI specification).  Thus some drives, if sent a FORMAT UNIT
command may return immediately without performing any action, may simply
perform a read test on the entire disk (the most common option), or may
actually write data to the disk[4][5].  This is rather common among newer
drives which can't directly be low-level formatted, unlike older ST-412 and
ST-506 MFM or RLL drives.  For example trying to format an IDE drive generally
has little effect - a low-level format generally isn't possible, and the
standard DOS `format' command simply writes a boot record, FAT, and root
directory, performs a quick scan for bad sectors, and exits.

Therefore if the data is very sensitive and is stored on floppy disk, it can
best be destroyed by removing the media from the disk liner and burning it.
Disks are relatively cheap, and can easily be replaced.  Permanently destroying
data on fixed disks is less simple, but the multiple-overwrite option used by
SFS at least makes it quite challenging (and expensive) to recover any
information.

Footnote [1]: Among others there is the Department of Defense standard DoD
              5200.28-M, Army standard AR 380-19, Navy standards OPNAVINST
              5510.1H and NAVSO P-5239-26, Air Force standard AFSSI-5020, and
              Department of Energy standard DOE 5637.1.

Footnote [2]: The UK Ministry of Defence grinds down disk platters and then
              supposedly stores the (still-classified) dust for a decade or
              more.  Rumours that they remove programmers brains for storage in
              order to erase the classified information they contain are
              probably unfounded.

Footnote [3]: From a posting to the usenet alt.security newsgroup on 1 August
              1994, message-ID <31c75s$pa8@search01.news.aol.com>: "I got fired
              from my job and was told to clean my desk, so I immediately went
              to my office and ran Norton WipeDisk on my hard drive, which
              contained much of the work I had done and also my contact list,
              letters, games, and so on.  Unfortunately, I had DoubleSpaced it
              and the files were easily recovered".

Footnote [4]: Again it may be possible to bypass this using highly hardware-
              specific methods.  For example Quantum SCSI drives manufactured a
              few years ago could be forced to write data to disk during a
              format by changing the sector filler byte before the format
              command was issued.

Footnote [5]: The SCSI-2 standard includes an initialization pattern (IP)
              option for the FORMAT UNIT command (Section 8.2.1.2), however it
              is not clear how well this is supported by existing drives.


Safeguarding Cryptographic Keys

A threshold scheme divides a message (in this case the key to be protected)
into `n' pieces, or shares, so that any `m' of these shares can be used to
reconstruct the key, but `m-1' or less cannot reconstruct it.  This is called
an (m,n) threshold scheme.

A simple all-or-nothing scheme would break a key into `n' parts such that the
key could be recovered by taking the exclusive-or of these parts.  However this
method allows no margin of error, since even a single missing share will
destroy the ability to recreate the key.  This method allows for a limited form
of key safeguarding, but is not a true threshold scheme.

SFS uses a true threshold scheme, namely the one presented in "How to Share a
Secret" by Adi Shamir, Communications of the ACM, Vol.22, No.11 (November
1979), p.612.  This involves chosing a prime `p' which is larger than the
number of shares required, and an arbitrary polynomial of degree `m-1', where
`m' is the number of shares necessary to reconstruct the secret.  To distribute
the secret data, we generate a polynomial:

    ax^(m-1) + bx^(m-2) ... + cx + M (mod p)

where `a' ... `c' are random secret coefficients which are discarded once the
data has been distributed, `p' is a prime number larger than any of the
coefficients, and `M' is the secret to be distributed.  For example if we
wanted to create a (3,n) threshold scheme in which three shares out of the
total number would be necessary to reconstruct the secret data, we would
generate the quadratic polynomial:

    ax^2 + bx + M (mod p)

The shares themselves are obtained by evaluating the polynomial at `n'
different points, where `n' is the total number of shares distributed.  In this
case for our polynomial f() we evaluate it at x = 1, x = 2, ... x = n, and
distribute the resulting f( 1 ), f( 2 ), ... f( n ) values as the share data.
Since the polynomial has `m' unknown coefficients a, b, ... c and M, any `m'
shares can be used to create `m' equations, after which linear algebra can be
used to solve for M.  `m-1' shares can't recreate the secret.  More than `m'
shares are redundant.

For example, suppose we wish to create a (3,5) threshold scheme in which any 3
of a total of 5 shares are sufficient to reconstruct the secret M.  Assuming
M = 5 and taking two random coefficients a = 4 and b = 6 and a prime p = 13, we
have:

    f(x) = 4x^2 + 6x + 5 (mod 13)

Evaluating this at x = 1...5 gives:

    f(1) = 4 + 6 + 5 (mod 13)
         = 2
    f(2) = 16 + 12 + 5 (mod 13)
         = 7
    f(3) = 36 + 18 + 5 (mod 13)
         = 7
    f(4) = 64 + 24 + 5 (mod 13)
         = 2
    f(5) = 100 + 30 + 5 (mod 13)
         = 5

To reconstruct `M' from three of these shares (for example share 1, 3, and 5),
we need to solve the set of linear equations:

    2 = a.1^2 + b.1 + M (mod 13)
    7 = a.3^2 + b.3 + M (mod 13)
    5 = a.5^2 + b.5 + M (mod 13).

We can do this using Lagrange interpolation to recover the values a = 4, b = 6,
and M = 13, giving us the original secret.

A single share therefore consists of an X coordinate 1, 2, ... n, and however
many Y coordinates f( 1 ), f( 2 ), ... f( n ) are needed.  The total set of
shares is a set of X coordinates X[] and a corresponding number of Y
coordinates Y[].

To reconstruct the secret, we first calculate the coefficients C[]:

    for( i = 0; i < m; i++ )
        C[ i ] = 1;
        for( j = 0; j < m; j++ )
            if( i != j )
                C[ i ] *= X[ j ] / ( X[ i ] - X[ j ] );

Once we have the coefficients, we can reconstruct the secret from the rest of
the share, namely the Y coordinates:

    secret = 0;
    for( i = 0; i < m; i++ )
        secret += C[ i ] * Y[ i ];

To make the secret-sharing task easier, we use a finite field for all our work.
An exact explanation of the theory involved is beyond the scope of this
document, anyone interested in more background details should refer to the
references in the "Recommended Reading" section below.  It suffices to note
that the secret sharing scheme used by SFS works in GF( 2^n ), in which
addition, subtraction, multiplication, and division are all well defined.
There is an additive identity 0, a multiplicative identity 1, every nonzero
number has a unique inverse, and the commutative, associative, and distributive
laws all hold.  Galois fields are very convenient to work with since they keep
the numbers involved to a finite size, and there are no rounding errors in
division.

All arithmetic is done modulo an irreducible polynomial of degree `n'.  The
characteristics of the Galois fields used by SFS are as follows:

     n      Max.no of shares    Generator polynomial

     4            15            x^4 + x + 1
     5            31            x^5 + x^2 + 1
     6            63            x^6 + x + 1
     7           127            x^7 + x + 1
     8           255            x^8 + x^4 + x^3 + x^2 + 1
     9           511            x^9 + x^4 + 1
    10          1023            x^10 + x^3 + 1
    11          2047            x^11 + x^2 + 1
    12          4095            x^12 + x^6 + x^4 + x + 1
    13          8191            x^13 + x^4 + x^3 + x + 1
    14         16383            x^14 + x^5 + x^3 + x + 1
    15         32767            x^15 + x + 1
    16         65535            x^16 + x^5 + x^3 + x^2 + 1

Although SFS could use any of GF( 2^4 ) ... GF( 2^16 ), the current
implementation is restricted to GF( 2^8 ) to allow data to be processed in
byte-sized quantities.

The use of a threshold scheme can be extended to allow shareholders to be
replaced or added at a later point using the techniques given in David Chaum's
paper "How to Keep a Secret Alive", presented during Crypto'84 and appearing in
volume 196 of the Lecture Notes in Computer Science published by
Springer-Verlag (ISBN 3-540-15658-5 and 0-387-15658-5).  The replacement of
a shareholder requires that a shareholder break his share into subshares and
distribute the subshares to the other shareholders, allowing them to recreate
his share if it is lost or destroyed.  The creation of a new shareholder
requires the generation of additional shares which are again broken into
subshares and distributed to shareholders.  In either case a quorum of
shareholders can use their subshares to vote on whether a shareholder should be
replaced, or a new shareholder created.  SFS does not currently support this
extended key safeguarding due to the complexity involved in its use.
