Encryption Background

From CSEP590TU
Jump to: navigation, search

Intro

Before we can begin to define terms, and speak about the history of encryption, we must first define what encryption is. Encryption is the transformation of data into a form unreadable by anyone without a secret decryption key. Its purpose is to ensure privacy by keeping the information hidden from anyone for whom it was not intended, including those who can see the encrypted data. Encryption may be used to make stored data private (e.g., data that is stored on a potentially vulnerable hard disk), or to allow a nonsecure communications channel to serve as a private communications channel. Encryption is sometimes described as the process of converting plain text into cipher text.

The above definition describes the entire process of encryption. There are several components that make up this process, which we will define. An Encryption Scheme is a primitive that specifies an encryption algorithm, which tells the sender how to process the plaintext using the key, thereby producing the ciphertext that is actually transmitted. An encryption scheme also specifes a decryption algorithm, which tells the receiver how to retrieve the original plaintext from the transmission while possibly performing some verification, too. Finally, there is a key-generation algorithm, which produces a key that the parties need to share.

Private Key Cryptography

The setting for private-key cryptography is also known as the “symmetric” setting due to the symmetry of keys held by parties. Namely, the encryptor and decryptor use the same key.

Symmetric Encryption

A Symmetric (shared-key) encryption scheme SE = (K; E;D) consists of three algorithms, as follows:

  • The randomized key generation algorithm K returns a string K. We let Keys(SE) denote the set of all strings that have non-zero probability of being output by K. The members of this set are called keys.
  • The encryption algorithm E takes a key K in Keys(SE) and a plaintext M to return a ciphertext C. This algorithm might be randomized or stateful.
  • The deterministic decryption algorithm D takes a key K in Keys(SE) and a ciphertext C to return some M.
  • We require that for any key K in Keys(SE) and any message M, if EK(M) returns a ciphertext C then DK(C) = M.

Once in possession of a shared key, the sender can run the encryption algorithm with key K and input message M to get back a string we call the ciphertext. The latter is then transmitted to the receiver.

The receiver, upon receiving a ciphertext C, will run the decryption algorithm with the same key used to create the ciphertext, namely compute DK(C). If C was an output of EK on input M then DK(C) will equal M, enabling the receiver to recover the ciphertext. The decryption algorithm might however, fail to decrypt and return a special symbol to indicate that it deems the ciphertext invalid.

Public Key Cryptography

The setting of public-key cryptography is also called the “asymmetric" setting due to the asymmetry in key information held by the parties. Namely one party has a secret key while another has the public key that matches this secret key. This is in contrast to the symmetry in the private key setting, where both parties had the same key.

An asymmetric encryption scheme is just like a symmetric encryption scheme except for an asymmetry in the key structure. The key pk used to encrypt is different from the key sk used to decrypt. Furthermore pk is public, known to the sender and also to possible adversaries. So while only a receiver in possession of the secret key can decrypt, anyone in possession of the corresponding public key can encrypt data to send to this one receiver.

Assymetric Encryption

An Asymmetric encryption scheme AE = (K; E;D) consists of three algorithms, as follows:

  • The randomized key generation algorithm K returns a pair (pk; sk) of keys, the public key and matching secret key, respectively.
  • The encryption algorithm E takes the public key pk and a plaintext M to return a ciphertext C. The algorithm may be randomized, but not stateful.
  • The deterministic decryption algorithm D takes the secret key sk and a ciphertext C to return some M.

We require that for any key-pair (pk; sk) that might be output by K and any message M, if C was returned by Epk(M) then Dsk(C) = M.

Block Ciphers

We have yet to define what makes up encryption and decryption algorithms. We define by defining another encryption primitive known as the block cipher. Block ciphers are the building blocks of encryption and decryption algorithms that work on fixed size plaintext inputs. A block cipher is a family of functions E: {0,1}k X {0,1}n à {0,1}n. This notation means that E takes two inputs, one being a k-bit string and the other an n-bit string, and returns an n-bit string. The first input is the key. The second might be called the plaintext, and the output might be called a ciphertext. The key-length k and the block-length n are parameters associated to the block cipher. They vary from block cipher to block cipher, as of course does the design of the algorithm itself. Note that for every encryption block cipher E, there is a corresponding decryption block cipher E^-1.

Modes of Operation

There are several ways in which to piece together block ciphers to create encryption and decryption algorithms. These different ways are called modes of operation. We define a few popular modes of operation here, however there are a multitude of others that are used today.

Substitution Cipher

The Substitution cipher does not use a block cipher due to its simplicity, however it was very popular throughout history and was fundamental in the evolution of encryption and encryption schemes. Each plaintext character or group of characters is simply replaced by a corresponding one from a cipher alphabet. The cipher alphabet may be shifted, reversed, or scrambled, in which case it is called a "mixed alphabet" or "deranged alphabet".

One Time Pad (OTP)

Although the classical One-Time-Pad (OTP) encryption scheme does not use a block cipher, it is a perfectly secure scheme and can be thought of as implementing a perfectly secure block cipher in a perfectly secure mode. The OTP encryption scheme SE = (K; E;D) is stateful and deterministic. The key-generation algorithm simply returns a random k-bit string K, where the key-length k is a parameter of the scheme. The encryptor maintains a counter, which is initially zero. The encryption algorithm XORs the message bits with key bits, starting with the key bit indicated by the current counter value. The counter is then incremented by the length of the message. Key bits are not reused, and thus if not enough key bits are available to encrypt a message, the encryption algorithm returns a special symbol. The ciphertext returned includes the value of the counter to enable decryption.

Electronic Code Book (ECB)

Operating in Electronic Code Book (ECB) mode yields a stateless symmetric encryption scheme SE = (K; E;D). It breaks a message M into n-bit blocks M[0]…M[m], where m=|M|/n. It feeds each of those blocks into a block cipher along with the key k to get correspondingly long n-bit ciphertexts C[0]…C[m]. The ciphertext representing the entire message M is C = C[0]…C[m].

Shown below is a conceptual diagram of how ECB mode pieces together block ciphers to create an encryption algorithm. The decryption algorithm is not shown, but is simply the inverse of the encryption algorithm in which the decryption block ciphers receive input C[0]…C[m] and key k, and return the plaintext M = M[0]…M[m].

http://ieng9.ucsd.edu/~jopos/cse_291/ECB.jpg
Figure 1: Electronic Code Book (ECB) mode encryption algorithm.

Cipher Block Chaining (CBC)

Operating in Cipher Block Chaining (CBC) mode with counter IV yields a stateful symmetric encryption scheme, SE = (K; E;D). Each block of plaintext is XORed with the previous ciphertext block before being encrypted. This way, each ciphertext block is dependent on all plaintext blocks up to that point.

Shown below is a conceptual diagram of how CBC mode pieces together block ciphers to create encryption and decryption algorithms.

http://ieng9.ucsd.edu/~jopos/cse_291/CBC_encryption.jpg
Figure 2: CBC mode Encryption Algorithm

http://ieng9.ucsd.edu/~jopos/cse_291/CBC_decryption.jpg
Figure 3: CBC mode Decryption Algorithm

ECB and CBC are both symmetric modes of operation. There are many variants and other modes of operation that are used today. In contrast, there are very few different asymmetric modes of operation in use today due to the popularity and reliability of RSA.

Data Encryption Standard (DES)

The Data Encryption Standard (DES) is the quintessential block cipher. Even though it is now quite old, and on the way out, no discussion of block ciphers can really omit mention of this construction. DES is a remarkably well-engineered algorithm which has had a powerful influence on cryptography. It is in very widespread use, and probably will be for some years to come.

DES has a key-length of k = 56 bits and a block-length of n = 64 bits. It consists of 16 rounds of what is called a "Feistel network." The basic idea behind a Feistel round is to break intermediate texts into a left and right side, which will be encrypted differently, and fed into the opposite side of the block cipher in the next round (i.e. the left output goes into the right input and vice versa). One of the reasons to use this round structure is that it is reversible eventhough the functions which make up the block ciphers are not. This structure is part of the reason why DES is such a strong cipher - without the correct keys and the correct Feistel network, the ciphertext is almost useless.

The only known flaw to DES is its age. The key length was used because at the time it was long enough to make exhaustive key searches and birthday attacks prohibitive. However, with the dramatic increase in processor speeds, this is no longer the case. Other modes have been created to circumvent the key length problem such as Double-DES, Triple-DES, and DESX, which all use a combination of DES ciphers.

Advanced Encryptoin Standard (AES)

The AES has a block length of n = 128 bits, and a key length k that is variable: it may be 128, 192 or 256 bits. So the standard actually specifies three different block ciphers: AES128, AES192, AES256. These three block ciphers are all very similar, so we will stick to describing just one of them, AES128.

AES is a very elegant algorithm which is based on finite fields and abstract algebra. A complete definition is beyond the scope of this paper, however the punchline is that AES solves the key-length and block-length problems of DES in a more efficient and elegant fashion.

RSA

The RSA system is the basis of the most popular public-key cryptography solutions. The acronym stands for Rivest, Shamir, and Adelman, the inventors of the technique. The RSA algorithm is based on the fact that there is no efficient way to factor very large numbers. Deducing an RSA key, therefore, requires an extraordinary amount of computer processing power and time.

The RSA algorithm has become the de facto standard for industrial-strength encryption, especially for data sent over the Internet. It is built into many software products, including Netscape Navigator and Microsoft Internet Explorer. The technology is so powerful that the U.S. government has restricted exporting it to foreign countries.

Encryption concepts and primitives are not only used for data protection, but also for data authentication. We define the two authentication schemes as they are very closely related to encryption schemes both symmetric and asymmetric.

Message Authentication

Message authentication allows one party - the Sender - to send a message to another party – the Receiver - in such a way that if the message is modified en route, then the Receiver will almost certainly detect this. Message authentication is also called “data-origin authentication," since it authenticates the point-of-origin for each message. Message authentication is said to protect the “integrity" of messages, ensuring that each that is received and deemed acceptable is arriving in the same condition that it was sent out - with no bits inserted, missing, or modified.

In this case the Sender and the Receiver share a secret key, K, which they'll use to authenticate their transmissions.

http://ieng9.ucsd.edu/~jopos/cse_291/mac.jpg
Figure 4: A message-authentication scheme. Sender S wants to send a message M to receiver R in such a way that R will be sure that M came from S. They share key K. Adversary A controls the communication channel. Sender S sends an authenticated version of M, Mbar, which adversary A may or may not pass on. On receipt of a message M, receiver R either recovers a message that S really sent, or else R gets an indication that M is inauthentic.

Privacy does not imply authenticity. The reader should note that although the goals may appear similar, they are distinct, and one does not guarantee the other.

Digital Signatures

A digital signature scheme is just like a message authentication scheme except for an asymmetry in the key structure. The key sk used to generate signatures (in this setting the tags are often called signatures) is different from the key pk used to verify signatures. Furthermore pk is public, in the sense that the adversary knows it too. So while only a signer in possession of the secret key can generate signatures, anyone in possession of the corresponding public key can verify the signatures.

The key usage is the “mirror-image" of the key usage in an asymmetric encryption scheme. In a digital signature scheme, the holder of the secret key is a sender, using the secret key to tag its own messages so that the tags can be verified by others. In an asymmetric encryption scheme, the holder of the secret key is a receiver, using the secret key to decrypt ciphertexts sent to it by others.

http://ieng9.ucsd.edu/~jopos/cse_291/digital_signatures.jpg
Figure 5: Digital Signature Scheme.

Attacks

All encryption schemes have been scrutinized and gone under attack of one form or another. A “good” encryption scheme is one that can detect when they have been attacked. Although determining such a thing is a fairly recent idea and is outside the scope of this paper, we discuss some different types of attacks that have proven successful on many encryption schemes in the past, as well as modern ones.

In all cases, the assumption is made that an adversary has access to the definition or the encryption scheme, and can listen to the channel on which ciphertexts are transferred. Making these assumptions may be a bit overzealous, but as we have seen in the past, security through obscurity is not strong enough. These assumptions are the standard assumption made in creating encryption schemes, and lead to more secure and robust encryption schemes. Also, we define an attack to be successful if it can deduce plaintexts, or even better the key.

The attacks below are ordered from weakest to strongest (i.e weak attacks are attacks that do not succeed as often as stronger ones because they know less information).

Known Ciphertext Attack

A known ciphertext attack is a cryptanalytic attack in which the adversary has access to a set of ciphertexts. This is the easiest attack to mount, since it only requires the adversary to listen to an insecure communication channel. Conversly, this attack is the most difficult attack to execute successfully because the adversary has so little knowledge. Substitution ciphers, and other ciphers that repeat patterns are susceptible to these attacks.

Known Plaintext Attack

The known plaintext attack is a cryptanalytic attack in which the adversary has samples of both the plaintext and its ciphertext and is at liberty to make use of them to reveal further secret information. These types of attacks were carried out during WWII at Bletchley Park against the Germans.

Chosen Plaintext Attack

A chosen plaintext attack is any form of cryptanalysis which presumes that the adversary has the capability to choose arbitrary plaintexts to be encrypted and obtain the corresponding ciphertexts. This appears, at first glance, to be an unrealistic model; it would certainly be unlikely that an attacker could persuade a human cryptographer to encrypt large amounts of plaintexts of the attacker's choosing. Modern cryptography, on the other hand, is implemented in software or hardware and is used for a diverse range of applications; for many cases, a chosen-plaintext attack is often very feasible. In addition, any cipher that can prevent chosen-plaintext attacks is then also guaranteed to be secure against known plaintext and known ciphertext attacks

Replay Attack

A Replay attack is an attack in which a valid data transmission is maliciously or fraudulently repeated, either by the originator or by an adversary who intercepts the data and retransmits it. Although this attack does not glean any information about plaintexts or encryption keys, it is still dangerous as it is very difficult to detect. Since the adversary does not have to manipulate the ciphertext, digital signatures and message authentication schemes are of little help.

It is extremely difficult to detect malicious replay attacks performed by the originator of a message, especially since they are supposed to be cooperative. There are however, ways to detect replay attacks performed by outside adversaries such as requiring message transmissions to contain a sequence number. This defense is simple, yet effective since each time a legitimate user message will contain an updated sequence number that will be difficult for the adversary to reconstruct. Further, by encrypting the sequence number in the ciphertext, we can use message authentication and digital signature schemes to detect whether ciphertexts have been manipulated to update message sequence numbers.

References

http://www.cse.ucsd.edu/users/mihir/cse107/classnotes.html
http://www.cacr.math.uwaterloo.ca/hac/
http://en.wikipedia.org/wiki/