ου γαρ εστιν κρυπτον ο ου φανερον γενησεται ουδε αποκρυφον ο ου γνωσθησεται και εις φανερον ελθη

Simple XOR cipher

Polyalphabetic substitution cipher

The simple XOR cipher was quite popular in the early days of computing, on operating systems such as MS-DOS and Macintosh. Despite its simplicity and susceptibility to attacks, the cipher was used in many commercial applications, thanks to its speed and uncomplicated implementation.

The simple XOR cipher is a variation of the Vigenère cipher. It differs from the original in that it operates on bytes, as stored in computer memory, rather than on letters.

Instead of adding two letters of the alphabet, as in the original Vigenère cipher, the XOR algorithm combines successive plaintext bytes with successive secret key bytes using the XOR operation. After the last secret key byte has been used, the algorithm returns to the first key byte (as in Vigenère encryption).

Decryption takes exactly the same steps as encryption: successive ciphertext bytes are combined with successive secret key bytes using the XOR operation.

Both encryption and decryption can be presented using the following equations:
    M XOR K = C
    C XOR K = M
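
Because both equations use the same operation, a single routine can serve for encryption and decryption. The following is a minimal sketch of that idea; the function name `xor_cipher` and its parameters are illustrative and do not come from the original text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: XORs every byte of buf with the repeating key.
   Calling it twice with the same key restores the original bytes,
   because (M XOR K) XOR K = M. */
void xor_cipher(unsigned char *buf, size_t len,
                const unsigned char *key, size_t key_len)
{
  size_t i;

  assert(key_len > 0);           /* i % key_len is undefined for an empty key */
  for (i = 0; i < len; i++)
    buf[i] ^= key[i % key_len];  /* wrap around to the first key byte */
}
```

For instance, applying it to the ASCII bytes of "HELLO" with the key "AB" gives the bytes 0x09 0x07 0x0D 0x0E 0x0E, and applying it again with the same key gives back "HELLO".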

Security of the simple XOR cipher

The simple XOR cipher is quite easy to break. It offers no better protection than some other classical polyalphabetic substitution ciphers, and with a computer it can be broken in a relatively short time.

The first step in breaking the cipher is almost always to determine the length of the secret key. This can be done easily by calculating the index of coincidence of the ciphertext.
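
One common way to apply the index of coincidence to this problem is sketched below. It is an assumption about the procedure, not necessarily the exact method the text has in mind. For each candidate key length, the ciphertext is split into that many columns (every n-th byte), and the index of coincidence is averaged over the columns. The function names `column_ic` and `average_ic` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Index of coincidence of the bytes c[start], c[start + step], ...:
   the probability that two bytes picked at random from this
   subsequence are equal. */
double column_ic(const unsigned char *c, size_t len, size_t start, size_t step)
{
  unsigned long count[256] = {0};
  unsigned long n = 0, pairs = 0;
  size_t i;
  int b;

  assert(step > 0);
  for (i = start; i < len; i += step) {
    count[c[i]]++;
    n++;
  }
  if (n < 2) return 0.0;                   /* too short to form a pair */
  for (b = 0; b < 256; b++)
    if (count[b] > 1)
      pairs += count[b] * (count[b] - 1);  /* ordered pairs of equal bytes */
  return (double) pairs / ((double) n * (double) (n - 1));
}

/* Average index of coincidence over the key_len columns obtained by
   taking every key_len-th byte of the ciphertext. */
double average_ic(const unsigned char *c, size_t len, size_t key_len)
{
  double sum = 0.0;
  size_t k;

  assert(key_len > 0);
  for (k = 0; k < key_len; k++)
    sum += column_ic(c, len, k, key_len);
  return sum / (double) key_len;
}
```

Under this approach, the average is computed for key lengths 1, 2, 3 and so on, and the smallest length that gives a clearly higher value is taken as the likely key length. Multiples of the true length tend to score high as well, so the result is a guess to be confirmed by the later steps.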

After determining the length of the key, one should write the ciphertext down twice, one line under the other, with the lower line offset by the key length relative to the upper line. XORing the two bytes in each column cancels the key, because both bytes were encrypted with the same key byte. The result is a sequence in which each byte is the XOR of two plaintext bytes that lie one key length apart, with no secret key contribution left.
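
The column-wise XOR can be sketched as follows. The function name `xor_with_shift` and the caller-supplied output buffer are illustrative choices, and the key length is assumed to be known already.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes c[i] XOR c[i + key_len] into out for
   every i with i + key_len < len and returns the number of bytes
   written. Since c[i] = m[i] XOR k[i % key_len], and both positions
   use the same key byte, each output byte equals m[i] XOR m[i + key_len].
   The out buffer must hold at least len - key_len bytes. */
size_t xor_with_shift(const unsigned char *c, size_t len, size_t key_len,
                      unsigned char *out)
{
  size_t i;

  assert(key_len > 0);
  if (len <= key_len) return 0;  /* nothing overlaps after the shift */
  for (i = 0; i + key_len < len; i++)
    out[i] = c[i] ^ c[i + key_len];
  return len - key_len;
}
```

The output depends only on the plaintext, so two ciphertexts of the same message produced with different keys of the same length give identical output.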

Thanks to the redundancy of natural-language text stored in binary form as bytes, it is possible to guess the letters of the original message from the resulting bytes.

The following C program encrypts a given file using the simple XOR cipher:

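#include <stdio.h>

int main (int argc, char *argv[])
{
  FILE *fi, *fo;
  const char *cp;
  int c;

  /* Require a non-empty key and both file names. */
  if (argc < 4 || argv[1][0] == '\0') {
    fputs("Usage: program_name key input_file output_file\n", stderr);
    return 1;
  }
  if ((fi = fopen(argv[2], "rb")) == NULL) {
    perror(argv[2]);
    return 1;
  }
  if ((fo = fopen(argv[3], "wb")) == NULL) {
    perror(argv[3]);
    fclose(fi);
    return 1;
  }

  cp = argv[1];
  while ((c = getc(fi)) != EOF) {
    if (!*cp) cp = argv[1];       /* end of key: wrap around to its first byte */
    c ^= (unsigned char) *cp++;   /* XOR the data byte with the current key byte */
    putc(c, fo);
  }

  fclose(fo);
  fclose(fi);
  return 0;
}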

Usage (running the program again on the encrypted file with the same key decrypts it):
    program_name key input_file output_file