THE SHERLOCK CIPHER


If Sherlock fans ever had to communicate secretly, there is little doubt which code they would choose: the Dancing Men. At a time when the importance of encryption had not yet caught the public eye, Sir Arthur Conan Doyle came up with a cipher of his own!

Technically, the Dancing Men code is a mono-alphabetic substitution cipher, in which each letter of the alphabet is always replaced by the same letter or symbol. Holmes himself pointed out its weakness: the frequency of each symbol can be used to break the cipher. That weakness is why simple substitution gave way to stronger schemes such as the Playfair cipher and poly-alphabetic ciphers.
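To make the frequency attack concrete, here is a minimal Python sketch. It assumes the stick figures have already been transcribed as letters, one letter per distinct figure; the function name `rank_by_frequency` and the sample message are only illustrations, not anything taken from the story's solution.

```python
from collections import Counter


def rank_by_frequency(text: str) -> list[tuple[str, int]]:
    """Return (symbol, count) pairs, most frequent first."""
    # Keep letters only; spaces and punctuation carry no symbol.
    symbols = [ch for ch in text.upper() if ch.isalpha()]
    counts = Counter(symbols)
    # Sort by count (descending), then alphabetically so ties are stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Sample message; the letters stand in for the dancing figures.
sample = "ELSIE PREPARE TO MEET THY GOD"
ranked = rank_by_frequency(sample)
print(ranked[:3])
```

In a long enough message the most common figure most likely stands for E, the next for T, and so on, which is the foothold Holmes uses.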

Anyway, coming back to the topic: though Doyle created a code, he did not use the whole alphabet in the story, so we do not have a complete cipher. Eight letters are missing, namely F, J, K, Q, U, W, X and Z. Many have proposed their own Dancing Men ciphers. As a humble Sherlockian, I would like to propose my own theory regarding the matter.

Let us first analyse the data available to us. The symbols can be divided into two categories: symmetric and asymmetric. A, I, N, D and C are the mirror images of O, R, L, G and Y respectively. Also, T is the inverse (the upside-down form) of E.

From this, I came to the following conclusion.

Law 1
Each asymmetric symbol must have its mirror image.

Law 2
Each symmetric symbol must have its inverse.

This means that we have six symbols which haven’t been used. These are the mirror images of S, P and V and the inverses of B, H and M. So we need two more symbols. The only thing we know about these two symbols is that they are mirror images of each other. This is when I noticed that D and G are the only inverted asymmetric symbols. So we can safely assume that their upright forms are also present. Thus we have eight symbols for the eight letters. But how do we match them?
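The bookkeeping so far can be checked with a short Python sketch. Letters stand in for the stick figures, and the names used (`MIRROR_PAIRS`, `unused_symbol_count` and so on) are my own labels for illustration.

```python
import string

# Pairings stated above (letters stand in for the figures).
MIRROR_PAIRS = {"A": "O", "I": "R", "N": "L", "D": "G", "C": "Y"}
INVERSE_PAIRS = {"T": "E"}
UNPAIRED_ASYMMETRIC = ["S", "P", "V"]  # their mirror images are unused
UNPAIRED_SYMMETRIC = ["B", "H", "M"]   # their inverses are unused


def known_letters() -> set[str]:
    """Letters that actually appear in Doyle's messages."""
    letters = set(UNPAIRED_ASYMMETRIC) | set(UNPAIRED_SYMMETRIC)
    for pairs in (MIRROR_PAIRS, INVERSE_PAIRS):
        for first, second in pairs.items():
            letters.update((first, second))
    return letters


def missing_letters() -> list[str]:
    """Letters of the alphabet with no known symbol."""
    return sorted(set(string.ascii_uppercase) - known_letters())


def unused_symbol_count() -> int:
    """Symbols predicted by Law 1, Law 2 and the upright D/G assumption."""
    upright_d_and_g = 2  # assumed upright forms of the inverted D and G
    return len(UNPAIRED_ASYMMETRIC) + len(UNPAIRED_SYMMETRIC) + upright_d_and_g


print(missing_letters())
print(unused_symbol_count())
```

The count of predicted symbols matches the count of missing letters, which is what makes a one-to-one assignment possible.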

This is where the relative frequency of letters comes into play. From The Adventure of the Dancing Men it is clear that Doyle was aware of it. So let us arrange the symbols in order of decreasing frequency.




  • The first two symbols (E and T) form an inverse pair.
  • The next two letters (A and O) are mirror images.
  • The mirror image of the next letter (I) comes 3 symbols later.
  • The mirror image of N comes 4 symbols later.
  • So the mirror image of S will come 5 symbols later, i.e. U is the mirror image of S.
  • The next asymmetric letter is D, whose pair G comes after 6 symbols.
  • C and Y are separated by 5 symbols. So the next pair has a separation of 4 symbols.
  • The next known asymmetric letters are P and V. But P lies outside the pair C-Y. So the next pair must be F and V. P and J form the pair after that.
  • After J, we put the last asymmetric pair without any separation. It will be noticed that there are many vacant spots in our crypt. These are filled by the inverses of the symmetric symbols to obtain the complete code.



Please note that this is just a theory and may not be the correct code. Do share your opinions and theories on the same.
