No. Maybe… See comments below.
Ever searching for a new way to attempt a K4 solution, I stumbled upon a series of ciphers that utilize a Polybius square.
In cryptography, the Polybius square, also known as the Polybius checkerboard, is a device invented by the Ancient Greek historian and scholar Polybius for fractionating plaintext characters so that they can be represented by a smaller set of symbols. (Wikipedia)
In classical cryptography, the bifid cipher is a cipher which combines the Polybius square with transposition, and uses fractionation to achieve diffusion. It was invented around 1901 by Felix Delastelle. (Wikipedia)
For those unfamiliar with why these types of ciphers caught my interest: the methods they use to encipher text could give rise to the letter frequencies observed in K4, which are flatter than you would see in a simple substitution or straight transposition.
Using a Polybius square:
Row/Col | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|
1 | A | B | C | D | E |
2 | F | G | H | I/J | K |
3 | L | M | N | O | P |
4 | Q | R | S | T | U |
5 | V | W | X | Y | Z |
I converted the text of K4 into numbers.
34-12-25-42-45-34-53-45-22-23-45-31-12-43-34-31-24-21-12-12-52-21-31-42-51
41-41-35-42-33-22-25-43-43-34-44-52-44-41-43-24-41-43-43-15-25-55-55-52-11
44-24-25-31-45-14-24-11-52-24-33-21-12-33-54-35-51-44-44-32-55-21-35-25-52
22-14-25-55-53-44-24-13-14-24-22-25-45-11-45-15-25-13-11-42
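The conversion can be sketched in a few lines of Python. This is only an illustration: the names `SQUARE`, `to_coords` and `format_coords` are made up for the example, and it assumes the standard square from the table above, with I and J sharing a cell.

```python
# Standard 5x5 Polybius square, read row by row; J is folded into I.
SQUARE = "ABCDEFGHIKLMNOPQRSTUVWXYZ"

K4 = "OBKRUOXOGHULBSOLIFBBWFLRVQQPRNGKSSOTWTQSJQSSEKZZWATJKLUDIAWINFBNYPVTTMZFPKWGDKZXTJCDIGKUHUAUEKCAR"


def to_coords(text, square=SQUARE):
    """Return a 1-based (row, column) pair for every letter in text."""
    coords = []
    for ch in text.upper():
        if not ch.isalpha():
            continue
        idx = square.index("I" if ch == "J" else ch)
        coords.append((idx // 5 + 1, idx % 5 + 1))
    return coords


def format_coords(coords):
    """Join the pairs in the same dash-separated style used in the post."""
    return "-".join(f"{row}{col}" for row, col in coords)


# Coordinates of the first seven letters of K4 (OBKRUOX).
print(format_coords(to_coords(K4[:7])))
```

Running `format_coords(to_coords(K4))` gives the full 97-pair string in one go, which is easier to double-check than a hand conversion.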
In a Bifid cipher, you write each letter's coordinates vertically, read the digits off row by row, pair those digits up, and then use the Polybius square to translate the pairs back into letters.
In order, from the top row to the second and then paired:
31-24-43-54-22-43-14-33-22-11-52-34-54-43-43
22-44-34-54-44-24-44-12-55-51-42-23-41-21-52
32-13-53-54-43-52-32-52-12-55-42-11-22-24-24
14-12-11-44-25-25-43-52-35-12-34-14-12-22-11-21
11-52-32-53-34-42-41-34-13-35-55-52-14-45-15-44
12-43-12-34-51-44-25-15-52-24-55-34
43-44-25-53-51-55-53-12
Using the regular Polybius square to translate:
L I/J S Y G S D N G A W O for the beginning, which is enough gibberish that I chose not to translate it all.
Knowing that Bifid ciphers usually use a mixed Polybius square, I tried a keyed Polybius square as well (keyword = KRYPTOS):
D B M X S M P F S K V G for the beginning, which was gibberish as well.
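For anyone who wants to repeat the experiment with other keywords, here is a minimal sketch of the same procedure. The function names `keyed_square` and `bifid_rows_then_pairs` are invented for this example, and it assumes the keyed square is built the usual way: keyword letters first, then the rest of the alphabet, with J folded into I.

```python
def keyed_square(keyword=""):
    """Build a 25-letter square: keyword letters first, then the unused letters (J merged into I)."""
    seen = []
    for ch in keyword.upper() + "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
        ch = "I" if ch == "J" else ch
        if ch.isalpha() and ch not in seen:
            seen.append(ch)
    return "".join(seen)


def bifid_rows_then_pairs(text, square):
    """Write coordinates vertically, read all row digits then all column digits, and re-pair them."""
    letters = ["I" if c == "J" else c for c in text.upper() if c.isalpha()]
    rows = [square.index(c) // 5 for c in letters]
    cols = [square.index(c) % 5 for c in letters]
    stream = rows + cols  # always an even number of digits
    return "".join(square[stream[i] * 5 + stream[i + 1]] for i in range(0, len(stream), 2))


standard = keyed_square()          # plain A-Z square
kryptos = keyed_square("KRYPTOS")  # KRYPT / OSABC / DEFGH / ILMNQ / UVWXZ
print(standard, kryptos)
```

As a sanity check, with the mixed square BGWKZ QPNDS IOAXE FCLUM THYVR this routine turns FLEEATONCE into UAEOLWRINS. Note that it applies the enciphering direction of Bifid, which is the operation described in the post.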
I feel pretty confident saying that K4 is not a Bifid cipher.
What if the digits were fractionated via transposition before being turned into the final ciphertext?
1. PT to number coordinates
2. Number coordinates written into a transposition matrix horizontally and the columns transposed
3. Final string of digits written out column by column
4. New order of the digits turned into CT via the same polybius square used in step 1
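The four steps above can be sketched as follows, together with the inverse, so the scheme can at least be tested on known plaintext. Everything here is an assumption made for illustration: the standard square, the helper names, and the column order `[2, 0, 3, 1]`, which is just a made-up transposition key.

```python
SQUARE = "ABCDEFGHIKLMNOPQRSTUVWXYZ"  # standard square, I/J combined


def to_digits(text, square=SQUARE):
    """Step 1: letters to a flat list of 0-based row/column digits."""
    digits = []
    for ch in text.upper():
        if ch.isalpha():
            idx = square.index("I" if ch == "J" else ch)
            digits += [idx // 5, idx % 5]
    return digits


def from_digits(digits, square=SQUARE):
    """Step 4: pair the digits up and look each pair up in the same square."""
    return "".join(square[digits[i] * 5 + digits[i + 1]] for i in range(0, len(digits), 2))


def transpose(digits, order):
    """Steps 2-3: write digits row by row into len(order) columns, read columns in the given order."""
    width = len(order)
    return [d for col in order for d in digits[col::width]]


def untranspose(digits, order):
    """Undo transpose(): work out each column's length, then refill the grid."""
    width, n = len(order), len(digits)
    out = [None] * n
    pos = 0
    for col in order:
        length = len(range(col, n, width))
        out[col::width] = digits[pos:pos + length]
        pos += length
    return out


def encipher(plaintext, order):
    return from_digits(transpose(to_digits(plaintext), order))


def decipher(ciphertext, order):
    return from_digits(untranspose(to_digits(ciphertext), order))


order = [2, 0, 3, 1]  # hypothetical column order
ct = encipher("BERLINCLOCK", order)
print(ct, decipher(ct, order))
```

Reversing it is mechanical once the column order is known. The hard part for K4 is that neither the order nor the number of columns is given, so a search has to try each candidate and score the output.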
Possible, but the problem is how to reverse it and how to know when to stop when you don’t immediately retrieve the plaintext. It’s plausible and possible but not guaranteed, and therefore a potential time sink unless one is careful. An encouragement to be careful is not a discouragement of the attempt, however. Try to reverse the process and maybe you’ll save us all some trouble.
I have an idea about that if you have time to kick it around. I know what a timesuck it could be with no real way to verify… I hear you loud and clear there.
Sure!
Best way to get in touch? Email would probably work better than trying to cram it into a comment, but whatever works best.
Up to you. Put it here in pieces if needed or just make it a long one. If you want it up for anyone and everyone to mull over, just comment on my front page. I know Titus Groan is actively interested and despite my lack of ideas it doesn’t mean I’m not interested in trying it.
Here are the basics. Let’s take a hypothetical bifid example.
Because the top row contains all the first digits of the pairs (the row coordinates) and the bottom row contains all the second digits (the column coordinates), these two sets of digits will have their own distinct distributions.
For example, let’s say the top row of digits mostly consists of 1s and 3s, with a smattering of 5s, and very few 2s and 4s. Meanwhile, the bottom row consists mostly of 2s and 4s, again with a smattering of 5s and very few 1s and 3s.
When converting your digits back into letters using the original polybius square the top row is going to have greater incidence of the following “letters”:
11, 13, 31, 33, 51, 15, 53, 35
Correspondingly, it will have very, very few occurrences of 22, 44, 24, 42.
The “letters”: 14, 41, 34, 43, 21, 12, 25, 52, 32, 23, 54, 45 and 55 will occur occasionally.
Conversely, in the bottom row the following is true:
Primarily: 22, 24, 42, 44, 25, 52, 45, 54
Occasionally: 21, 12, 23, 32, 41, 14, 43, 34, 51, 15, 53, 35, and 55
Rarely: 11, 13, 31, 33
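To put rough numbers on this hypothetical, here is a small sketch. The digit frequencies are invented to match the description (mostly 1s and 3s on top, mostly 2s and 4s below), and the code assumes neighbouring digits are independent, which is a simplification.

```python
from itertools import product

# Hypothetical digit frequencies, chosen only to match the example in the text.
top = {1: 0.40, 3: 0.40, 5: 0.14, 2: 0.03, 4: 0.03}
bottom = {2: 0.40, 4: 0.40, 5: 0.14, 1: 0.03, 3: 0.03}


def pair_probabilities(digit_freq):
    """Probability of each two-digit 'letter' if adjacent digits are drawn independently."""
    return {f"{a}{b}": digit_freq[a] * digit_freq[b] for a, b in product(digit_freq, repeat=2)}


def ranked(digit_freq):
    """All 25 pairs, most likely first."""
    probs = pair_probabilities(digit_freq)
    return sorted(probs, key=probs.get, reverse=True)


print("top row, most likely:   ", ranked(top)[:8])
print("top row, least likely:  ", ranked(top)[-4:])
print("bottom row, most likely:", ranked(bottom)[:8])
```

With these made-up weights the top half favours 11, 13, 31, 33 and almost never yields 22, 24, 42, 44, while the bottom half does the opposite, which is the split described above.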
The key to breaking a 3-step encryption like this (substitution, fractionation/diffusion, substitution) is recognizing that it has happened at all. If K4 were a bifid, you’d likely see a very different distribution in the first 48 characters and the last 49 characters. However, if instead it was transposed using an even number of columns for the transposition key (thus preserving chunks of column coordinates and chunks of row coordinates), you’d expect to see clumps of letters showing up together, and not showing up together, with some letters spanning the in-between. Now take a look at the occurrences of each letter in K4:
A – 50, 58, 91, 96
B – 2, 13, 19, 20, 63
C – 83, 95
D – 56, 77, 84
E – 45, 93
F – 18, 22, 62, 72
G – 9, 31, 76, 86
H – 10, 89
I – 17, 57, 60, 85
J – 41, 52, 82
K – 3, 32, 46, 53, 74, 78, 87, 94
L – 12, 16, 23, 54
M – 70
N – 30, 61, 64
O – 1, 6, 8, 15, 35
P – 28, 66, 73
Q – 26, 27, 39, 42
R – 4, 24, 29, 97
S – 14, 33, 34, 40, 43, 44
T – 36, 38, 51, 68, 69, 81
U – 5, 11, 55, 88, 90, 92
V – 25, 67
W – 21, 37, 49, 59, 75
X – 7, 80
Y – 65
Z – 47, 48, 71, 79
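A list like this is easy to regenerate and check. The sketch below rebuilds the 1-based positions for every letter and also reports the gaps between successive occurrences. The function names are invented for the example.

```python
from collections import defaultdict

K4 = "OBKRUOXOGHULBSOLIFBBWFLRVQQPRNGKSSOTWTQSJQSSEKZZWATJKLUDIAWINFBNYPVTTMZFPKWGDKZXTJCDIGKUHUAUEKCAR"


def letter_positions(text):
    """Map each letter to the 1-based positions where it occurs."""
    positions = defaultdict(list)
    for i, ch in enumerate(text, start=1):
        positions[ch].append(i)
    return dict(positions)


def gaps(positions):
    """Distances between successive occurrences of one letter."""
    return [b - a for a, b in zip(positions, positions[1:])]


table = letter_positions(K4)
for letter in sorted(table):
    print(letter, table[letter], "gaps:", gaps(table[letter]))
```

Printing the gaps next to the positions makes the "appear, vanish, reappear" pattern easier to see at a glance.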
You can see how letters show up for short periods of time and then disappear for longer stretches only to come back later. I have a lot of follow up commenting on the idea, but I’d rather leave it here and let you swim through this part.
The association between groups of letters is the key, because they are built on underlying polybius square coordinates (at least if we’re assuming this is true – which I’m just hypothesizing here).
It’s a good argument if you leave out the last half. That might sound insane, but consider that the distribution would indeed contain bits of data about the original plaintext input. However, the intervals of each alphabetical letter in the ciphertext would be an artifact to some extent. I have a great weakness for the Polybius square cipher systems for K4 because they just seem to make sense. The intervals between ciphertext letters make one want to make assertions about a possible transposition, but that would potentially have to remain separate for the time being from any considerations of the mechanics of a bifid ciphering system.
I doubt Scheidt would have specifically guided Sanborn with a custom mixed alphabet Polybius designed to keep us from back-tracking so I don’t think we need to second guess in that way.
The actual distribution of numbers would be hard to reverse engineer. Let’s say he pseudo-randomly put E in row 1, column 5 then T in row 4, column 2 and then put S in the middle etc. it would mean that they would occur more often but when transposed and the pairs read off, it could be a crapshoot on how the ciphertext would turn out.
There’s a flicker in the back of my brain but I would rather see this through all the way instead of just cramping someone’s style. Let’s formalize it in words on how to reverse it for K4 and then give it a good shot. Not like I’m doing more than spinning my wheels right now. It’d be nice to feel useful even if it was just moral support for someone else.
What are the other follow-up comments?
The “crapshoot” you speak of is what would nicely disguise the method. Even a bifid with a period (rather than all the x coordinates followed by all the y coordinates) would produce a similar output to what I suggested.
I’d be up for treating K4 as a bifid with a period and using BERLIN as a crib as you did previously here:
https://kryptosfan.wordpress.com/speculations/known-plaintext-attacks/known-plaintext-attack-bifid/
For instance, we could hypothesize that the crib was intended to help us identify the period as well as provide a known plaintext attack. Splitting K4 into groups of 8 characters would mean:
N = Bx Ex
Y = Rx Lx
P = Ix Nx
V = Unknown
T = By Ey
T = Ry Ly
M = Iy Ny
Z = Unknown
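The table above can be generated mechanically for any period. This sketch assumes the crib sits at the start of a period-8 block (the hypothesis under discussion), and it uses "?1" and "?2" as placeholders for the two unknown plaintext letters. The function name `symbolic_block` is made up.

```python
def symbolic_block(plain_block):
    """For one bifid period, list which two plaintext coordinates form each ciphertext letter."""
    # All x (row) coordinates are read off first, then all y (column) coordinates.
    stream = [(p, "x") for p in plain_block] + [(p, "y") for p in plain_block]
    return [(stream[i], stream[i + 1]) for i in range(0, len(stream), 2)]


block = ["B", "E", "R", "L", "I", "N", "?1", "?2"]  # crib plus two unknown letters
for cipher_letter, (first, second) in zip("NYPVTTMZ", symbolic_block(block)):
    print(f"{cipher_letter} = {first[0]}{first[1]} {second[0]}{second[1]}")
```

Changing the length of `block` shows how the pairing shifts for other periods. With an odd period, one ciphertext letter in the middle mixes an x coordinate with a y coordinate.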
I’ve never heard of a bifid with a period before, how’d you think of that?
How do you compensate for the odd number of letters?
You can quickly google ACA Bifid and see an example. If the period is 8 and you have 1 letter left over, the last ciphertext letter would be the same as the last plaintext letter.
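A short sketch of a periodic bifid shows why a single leftover letter passes through unchanged: its row and column digits are simply read back in the same order. The standard square and the function names here are assumptions for illustration, not a claim about how K4 was made.

```python
SQUARE = "ABCDEFGHIKLMNOPQRSTUVWXYZ"  # standard square, I/J combined


def bifid_block(block, square=SQUARE):
    """Encipher one period: row digits, then column digits, re-paired into letters."""
    rows = [square.index(c) // 5 for c in block]
    cols = [square.index(c) % 5 for c in block]
    stream = rows + cols
    return "".join(square[stream[i] * 5 + stream[i + 1]] for i in range(0, len(stream), 2))


def periodic_bifid(text, period, square=SQUARE):
    """Split the text into blocks of `period` letters and encipher each block separately."""
    letters = "".join("I" if c == "J" else c for c in text.upper() if c.isalpha())
    return "".join(
        bifid_block(letters[i:i + period], square) for i in range(0, len(letters), period)
    )


# Nine letters with period 8 leaves a one-letter final block.
print(periodic_bifid("ABCDEFGHX", 8))
print(97 % 8)  # K4 has 97 letters, so period 8 leaves exactly one letter over
```

If K4 were a period-8 bifid enciphered this way, the final R would be a plaintext R.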
Hmm…
This gets more and more interesting…
Alright, it could very well be a period bifid with a non-keyed polybius square or a conjugated matrix bifid.
And bifid sucks me back in…
Agreed. I’m currently working through solving a periodic bifid by hand, with known plaintext and a known period, so that I can see how it works. Judging by your other post, you are likely further along in terms of how to attack it. Unsure if you follow the Yahoo group discussion, but I posted some of these thoughts there as well.
I would want to see if a periodic bifid has the ability to generate a final ciphertext IC < 3.8%
Did you look at the IC at all for this system during your research? Otherwise my plan was to code a few examples. My gut tells me the IC of the K4 plaintext is somewhat low to begin with, maybe in the 5.5% range, which is what allows the IC to get to the 3.6% level after it was encrypted. From all of the systems I've looked at, it's incredibly tough to get an IC < 3.8% unless the IC of the plaintext starts well below the standard 6.7% seen in English.
I’m by no means an expert but it seems like English has an IC of 1.73 (Friedman’s Military Cryptanalytics, Part I – Volume 2) so I’m not sure where you’re getting your initial IC values.
We’re just talking in terms of different scales. Your 1.73 comes from 26*6.7%.
Translating to your number, can a periodic bifid generate an end ciphertext with an IC of 0.93? I know that K1 has an IC at this level, but it’s a much shorter piece of text, making it more likely to achieve that level.
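Both scales come from the same calculation, so a sketch may help keep them straight. The function below uses the usual definition of the index of coincidence (the chance that two letters drawn without replacement match). Multiplying by 26 gives the normalized figure. The function name is made up.

```python
from collections import Counter

K4 = "OBKRUOXOGHULBSOLIFBBWFLRVQQPRNGKSSOTWTQSJQSSEKZZWATJKLUDIAWINFBNYPVTTMZFPKWGDKZXTJCDIGKUHUAUEKCAR"


def index_of_coincidence(text):
    """Probability that two letters picked at random (without replacement) are the same."""
    letters = [c for c in text.upper() if c.isalpha()]
    n = len(letters)
    counts = Counter(letters)
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))


ic = index_of_coincidence(K4)
print(f"raw IC: {ic:.4f}   normalized (x26): {26 * ic:.2f}")
```

For K4 this comes out at roughly 3.6% on the raw scale, or about 0.94 on the normalized scale, which is where the 0.93 above comes from. The same function can be pointed at the output of any test cipher to see how far it flattens a known plaintext.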
What about the use of null characters? The likelihood of a piece of ciphertext only 97 letters long having at least 1 of all 26 letters has always seemed suspicious to me. Perhaps it was carefully seeded to throw off analysis. -Devil’s Advocate
I have to warn you Sharpe, I feel comfortable with a lot of cipher techniques and information but I’m really not an expert. If I say the wrong thing or don’t understand what you’re talking about right away, please don’t be terribly frustrated, I’ll get there eventually.
I’m in the same boat – no worries. My feeling is that whatever was devised was modified somehow, although I’m not sure how.
However, back to the earlier discussion. I agree with you about transposition. With bifid you don’t have to recover the intermediate ciphertext as it was originally written, you just need the association between the letters and the x-y coordinates. If it were fractionated, transposed and then had a last layer of substitution that would be pretty impossible to solve.
In regard to null characters, I think it would be easier to make the final ciphertext look like English (1.73) by adding null characters than it would be to make it look really, really random. If we take NYPVTT = BERLIN at face value, every character decrypts to another character. If nulls were added, it would make more sense to add those characters after the final ciphertext was obtained.
I wonder if there isn’t a way to modify bifid ciphertext or the process to make it appear more random. Again, I still haven’t encoded any ciphertext to see what it looks like or how I could modify it. Hoping to have some time tonight.
Depending on how you understand the mechanism, it’s possible to be strict with the plaintext to easily manipulate the ciphertext output. I even made up my own cipher based on the idea: http://tinyurl.com/4hm2pkz.
Very interesting. This morning I stumbled onto another thought as well. And I might as well share it.
I found it interesting that NYPVTT decodes to BERLIN and in the process, if it were a Vigenère substitution, produces ELYOIE as the partial key that is revealed. Interesting because ELYO(I)E are all in the phrase LAYERTWO.
Keeping in mind that K4 can contain a potential spelling error in the CT, there could be an out-of-place character in the keystream. Consider a situation where the key was “LAYERTWO” repeated in various transpositions.
When you got to the 64th character in the key, you would have the last letter that was not used from the last repetition of LAYERTWO, which in this case could have been “ALOWRTYE”. The next section of 8 letters could be encrypted beginning “LYO*E”, where the * represents the I at the spot of a misspelling in the CT. It would be a crafty way to give us a plaintext clue and not reveal that the keyword is just a transposition of “LAYERTWO”, especially since there are two Es in the span of 6 letters.
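For anyone wanting to check a partial key like this, here is a minimal sketch. It assumes the Vigenère is worked over the KRYPTOS-keyed alphabet used in K1 and K2 rather than plain A-Z. The function name `implied_key` is invented for the example.

```python
# Keyed alphabet from the Kryptos tableau (assumed here; a plain A-Z alphabet gives other letters).
KRYPTOS_ALPHABET = "KRYPTOSABCDEFGHIJLMNQUVWXZ"


def implied_key(ciphertext, plaintext, alphabet=KRYPTOS_ALPHABET):
    """Key letters k satisfying index(c) = index(p) + index(k) (mod 26) in the given alphabet."""
    return "".join(
        alphabet[(alphabet.index(c) - alphabet.index(p)) % 26]
        for c, p in zip(ciphertext, plaintext)
    )


# The crib discussed above.
print(implied_key("NYPVTT", "BERLIN"))

# Sanity check against K1: the ciphertext starts EM, the plaintext starts BE, the key is PALIMPSEST.
print(implied_key("EM", "BE"))
```

Feeding in the longer crib BERLINCLOCK with the matching eleven ciphertext letters extends the partial key in the same way.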
Just a thought. I coded some examples quickly and noticed a fair amount of doubles (small sample size).
It’s an interesting thought but might just be a coincidence?
Completely agreed. Other than bifid, are there other systems where there is an intermediate ciphertext, but you don’t need to know the actual intermediate ciphertext?
I tend to agree with your thought from earlier where you stated systems that have a transposition step that relies on you recovering the intermediate ciphertext are probably out of play.
Not to dip from the same well too often, but any of the more complicated algorithms would qualify, just like the Rubik’s Cube. That’s why so many people can’t just solve it on their own: we get caught up in knowing where every square goes, but the later moves require you to just orient and then blindly twist, twist, twist. Even at the end, it seems like you’re screwing it up, but it always resolves itself.
I started defining some ground rules for checking for a periodic bifid.
Each letter is written as an X and Y coordinate taken from the letters in the corresponding period. Two rules should be used to evaluate each period. The first is that no combination of x and y coordinates with the same parent letters can represent more than one letter, unless it’s the “IJ” square or whichever letters are doubled up. For instance:
In period 7 for K4:
L = Ty Jy at position 54
C = Ty Jy at position 83
The only way this can be is if “LC” were in a square together in the polybius square. This isn’t the only instance of this in period 7 for K4. There are several times this happens.
Secondly, no more than 5 (or 6) letters can be encoded by a natural double. For instance, in K4, period 7:
I = Bx Bx at position 17
R = Qx Qx at position 24
G = Sx Sx at position 31
S = Sx Sx at position 43
E = Zx Zx at position 45
P = Tx Tx at position 66
This is because there can only be 5 (or 6) letters that line up on the diagonal of a polybius square.
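Both rules can be checked by machine for any period. The sketch below follows the notation used above (each letter at a position is expressed as two coordinates of letters in the same block) and treats I and J as separate letters, so an I/J clash would still need a manual look. All function names are made up for the example.

```python
from collections import defaultdict

K4 = "OBKRUOXOGHULBSOLIFBBWFLRVQQPRNGKSSOTWTQSJQSSEKZZWATJKLUDIAWINFBNYPVTTMZFPKWGDKZXTJCDIGKUHUAUEKCAR"


def symbolic_pairs(text, period):
    """For each position, the two (letter, axis) coordinates that would form it in that period."""
    out = []
    for start in range(0, len(text), period):
        block = text[start:start + period]
        stream = [(c, "x") for c in block] + [(c, "y") for c in block]
        out += [(stream[i], stream[i + 1]) for i in range(0, len(stream), 2)]
    return out


def conflicts(text, period):
    """Rule 1: coordinate pairs that would have to stand for more than one letter."""
    seen = defaultdict(set)
    for letter, pair in zip(text, symbolic_pairs(text, period)):
        seen[pair].add(letter)
    return {pair: letters for pair, letters in seen.items() if len(letters) > 1}


def natural_doubles(text, period):
    """Rule 2: (position, letter) where both coordinates are the same letter and axis."""
    pairs = symbolic_pairs(text, period)
    return [(i, letter) for i, (letter, (a, b)) in enumerate(zip(text, pairs), start=1) if a == b]


print(natural_doubles(K4, 7))
print(len(conflicts(K4, 7)), "conflicting pairs at period 7")
```

Looping `period` over a range and counting conflicts and distinct natural-double letters gives a quick way to rank candidate periods before doing any hand work.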
This might help:
http://rumkin.com/tools/cipher/bifid.php
Well, it looks like I was able to prove the previous work from March 4 in this thread, which is very exciting.
Good work!
kryptosfan, I like your site. I just started looking at Kryptos 4 yesterday. The frequency distribution looks exactly like a bifid with an even period because it is leveled out to some extent, but not entirely. There is a high count of period 21 bigram repeats, and another small spike at 42. Redraft the message into 21 columns and you will see what I mean. At that period, there are two repeating trigrams. I find it interesting that the clue BERLINCLOCK starts at the 64th position, which is 21*3+1. A bifid that encodes plaintext in chunks of 42 plaintext letters would create period 21 bigram repeats, but not so many at period 42. I shuffled the message 500 times to see if you could have as many period 21 bigram repeats, and that was easy to duplicate. However, out of the 500 shuffles, there were only four where there were as many period x repeats as with period 21, and also as many period x * 2 repeats as with period 42. Given that there are so many period 21 bigram repeats, and the clue starts at the 64th position, I think that the period 21 repeats may be significant.
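A shuffle test along these lines can be sketched as below. One caution: the exact counting method isn't spelled out above, so this assumes a "period d bigram" means the pair of letters d positions apart (the letters stacked vertically when the text is redrafted into d columns), and that a repeat is any extra occurrence of the same pair. The function names and the fixed seed are choices made for the example.

```python
import random
from collections import Counter

K4 = "OBKRUOXOGHULBSOLIFBBWFLRVQQPRNGKSSOTWTQSJQSSEKZZWATJKLUDIAWINFBNYPVTTMZFPKWGDKZXTJCDIGKUHUAUEKCAR"


def repeats_at_distance(text, d):
    """Count repeated digraphs formed from letters d positions apart."""
    digraphs = Counter(text[i] + text[i + d] for i in range(len(text) - d))
    return sum(n - 1 for n in digraphs.values() if n > 1)


def shuffle_test(text, d, trials=500, seed=0):
    """Fraction of random shuffles with at least as many distance-d repeats as the real text."""
    rng = random.Random(seed)
    target = repeats_at_distance(text, d)
    letters = list(text)
    hits = 0
    for _ in range(trials):
        rng.shuffle(letters)
        if repeats_at_distance("".join(letters), d) >= target:
            hits += 1
    return hits / trials


for d in (21, 42):
    print(d, repeats_at_distance(K4, d), shuffle_test(K4, d))
```

Testing the joint condition (as many repeats at d and at 2*d in the same shuffle) only needs a second comparison inside the loop. Different counting conventions will give different numbers, so results from this sketch are not directly comparable with the figures quoted above.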