Tutorial¶
In this basic tutorial, we’ll see how we can use freqens to break some weak crypto.
Scenario¶
You are working for the NSA and find that some terrorists are using brainfuck programs to hack the whole universe. Some intelligence lets you know that they are using extremely weak crypto (single byte xor) in order to “secure” their communications, which you have been able to intercept.
In particular, you want to decode a ciphertext that looks like this:
5 555 P55'P5 7PP&5V77V7P5V5V5&P77 55&P77&&&&55&P&5V7VV\x017P7&7P7V 7 P5V77 5&555V7V7P7V5P&P5 7&V5P7 5&V
777 5 5\x01++P&\x01++++P77 5&5&\x01++++++P77P&V55&\x01++++++++P77 5 5&\x01++++++++++P77&&5&55
7&\x01++++++++++++P77 5 55&&7&\x01++++++++++++++P77&5&5&\x01++++++++++++++++P77 5 55 7&\x01++++++++++++++++++P5&7&\x0
1++++++++++++++++++++P77&5&5&\x01++++++++++++++++++++++P77&55&\x01++++++++++++++++++++++++P77
555&7&\x01++++++++++++++++++++++++++P77&&&&555 7&\x01++++++++++++++++++++++++++++P77 555
7&\x01++++++++++++++++++++++++++++++P5P&V7&\x01++++++++++++++++++++++++++++++++P77&555
7&\x01++++++++++++++++++++++++++++++++++P77&555&&7&\x01++++++++++++++++++++++++++++++++++++P77 5 55
7&\x01++++++++++++++++++++++++++++++++++++++P77P&V5&55 7&\x01++++++++++++++++++++++++++++++++++++++++P77 5
55&&7&\x01++++++++++++++++++++++++++++++++++++++++++P7&55
7\x01++++++++++++++++++++++++++++++++++++++++++++P77&55&\x01VVVVVVVVVVVVVVVVVVVVVV7P&55P77 55&V777P555 777&V7P555
777&VV55V\x015P&P&&&P&7VV5V5P P7 5&&V5V 7 PP5 7&V7V55P&%5V
Getting an analyzer¶
An analyzer represents the ideal frequency distribution your target plaintext has. Once it has been fed, it can be asked to score strings based on how legit they seem (how similar its frequency distribution is to the analyzer’s). There are several ways of building an analyzer.
- From a raw file: you can build an analyzer like
from freqens.analyzer import Analyzer # a representative sample of the target frequency distribution # ie. a normal brainfuck program filename = "./program.bf" analyzer = Analyzer.from_raw_file(filename)
- From a frequency distribution file: which is a json file containing some absolute frequencies.
- An example of a frequency distribution file would be:
{ "a": 4, "b": 3 }
- This is how you build an analyzer from one of these files:
from freqens.analyzer import Analyzer filename = "./bf-distribution.json" analyzer = Analyzer.from_file(filename)
- From a string:
from freqens.analyzer import Analyzer analyzer = Analyzer("representative text")
- For this scenario, the easiest way to build the bf analyzer is to use the
freqens
command line utility which lets you extract a frequency distribution file from an specified set of files. For example: $> cd my-bf-programs $> freqens *.bf > bf_frequency_distribution.json
Breaking the code¶
Now that you know how to get a brainfuck analyzer, it’s time to break the code. We’ll decode the ciphertext with every possible key (as it is single byte xor, there’s only 256 possible keys) and let the analyzer discover what is the real ciphertext. Our program will look like:
from freqens.analyzer import Analyzer
def single_byte_xor(text, byte):
return "".join( chr(c ^ byte) for c in bytearray(text) )
with open("ciphertext.txt") as ciphertext_file:
ciphertext = ciphertext_file.read()
analyzer = Analyzer.from_file("bf_frequency_distribution.json")
possible_plaintexts = ( single_byte_xor(ciphertext, byte) for byte in range(256) )
answer = analyzer.choose_best(possible_plaintexts)
print answer[0] # Solution !!!
And the program will print:
+++>++>>>+[>>,[>+++++<[[->]<<]<[>]>]>-[<<+++++>>-[<<---->>-[->]<]]
<[<-<[<]+<+[>]<<+>->>>]<]<[<]>[-[>++++++<-]>[<+>-]+<<<+++>+>
[-
[<<+>->-
[<<[-]>>-
[<<++>+>-
[<<-->->>+++<-
[<<+>+>>--<-
[<<->->-
[<<++++>+>>+<-
[>-<-
[<<->->-
[<<->>-
[<<+++>>>-<-
[<<---->>>++<-
[<<++>>>+<-
[>[-]<-
[<<->>>+++<-
[<<->>>--<-
[<<++++>+>>+<-
[<<[-]>->>++<-
[<<+++++>+>>--<-
[<->>++<
[<<->>-
]]]]]]]]]]]]]]]]]]]]]]<[->>[<<+>>-]<<<[>>>+<<<-]<[>>>+<<<-]]>>]
>[-[---[-<]]>]>[+++[<+++++>--]>]+<++[[>+++++<-]<]>>[-.>]
Which is obviously an Universal Turing Machine! Now you know terrorists have turing-complete technology in their hands.