Help with Python Best Practices

05-03-2010, 04:26 AM
I am new to Python and I am hoping some veterans of the language could give me some pointers on my first program.

My goal was to write a program like XORSearch: it opens a binary file and finds a given string that has been XORed with any 1-byte value (0-255).

Basically, the idea is to take a word (in this case, MESSAGE) and XOR each of its letters with the first len('MESSAGE') bytes of the file data. If every XOR gives the same number, that number is our XOR key. If any of them differ, there is no match, so we shift the word over by one byte and try again, repeating until the end of the file.
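To make the idea concrete, here is a minimal sketch of why a match reveals the key. It assumes Python 3 syntax, and the key 0x5A and the variable names are made up for illustration: XORing each encoded byte with the matching plaintext letter gives back the same value every time.

```python
# Hypothetical example values: a known word and an arbitrary 1-byte key.
word = b'MESSAGE'
key = 0x5A

# Encode the word the way the file contents are assumed to be encoded.
encoded = bytes(b ^ key for b in word)

# XOR each encoded byte with the matching plaintext byte.
candidates = [e ^ w for e, w in zip(encoded, word)]

# Every candidate is the same value, so that value is the key.
all_same = len(set(candidates)) == 1
print(all_same, hex(candidates[0]))  # prints: True 0x5a
```

If the bytes at some offset were not the word XORed with a single key, the candidates would differ and all_same would be False.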

Here is the code:

file = open('file', 'rb')
data = file.read()
# Plaintext string to search for
string = 'MESSAGE'
# Loop to iterate through file data
for x in range(0, len(data)):
    # Stop before end of file
    if x+len(string) > len(data):
        break
    # XOR every letter in our string with
    # the current data from the file
    xored = []
    for y in range(len(string)):
        # I don't care about XORing the zeros
        # so use -- to skip them in output
        if ord(data[x+y]) == 0:
            xored.append('--')
        else:
            # Output the result into a new list
            xored.append(('%02X' % (ord(data[x+y]) ^ ord(string[y]))))

    match = True
    for z in range(len(xored) - 1):
        # If one or more of the values differs
        # from the other, it was not the same
        # XOR key and there is no match
        if xored[z] == '--' or (xored[z] != xored[z+1]):
            match = False

    if match:
        print 'Match found at offset 0x%02X with key: 0x%s' % (x, xored[0])
print 'Done'

I'm trying to code things in Python "the Python way", but while I was coding this I found myself doing things in a way that seemed very much like how I would code them in other languages like C/C++, C#, Java or PHP. The parts of the code that concern me most are the for loops; I feel like for x in range(0, len(data)): is just forcing Python to do a C-style for loop (for(int x=0; x<len(data); x++)).

I'd really like to use Python to write programs quickly and keep them easily readable. During my reading I came across a few new ideas (like comprehensions and slicing) that I thought might apply in this script, but I couldn't get them to work here.

If anyone could glance over my code and give me some tips on how to do it the Python way, it'd be appreciated. Is there an easier way to iterate over a sequence, such as a string, the way I am trying to do here?

Thanks in advance.

05-05-2010, 09:33 PM
I don't know which is better, but here's another way of looking at it:

# Open and read the data in binary mode.
data = open("testfile.txt", "rb").read()

# Plaintext string to search for.
string = "MESSAGE"

x = 0
# While loop to iterate through file data using while instead of for; gets
# rid of the break condition for when the iteration gets to the end of data.
# Uses <= so the final window, which ends at the last byte, is also checked.
while (x + len(string)) <= len(data):

    # List comprehension for generating the xored list. I heard it's faster than
    # the typical for loop.
    xored = ["%02X" % (ord(data[x+y]) ^ ord(string[y])) for y in range(len(string))]

    # List comprehension to evaluate the contents of xored. Basically, tries to
    # generate a list of dissimilar items from xored. If it fails (all items are the same),
    # acts as the "matched = True" condition of the original script.
    if not [z for z in xored if z != xored[0]]:
        print "Match found at offset 0x%02X with key: 0x%s" % (x, xored[0])

    # Increment x.
    x += 1

print "Done"
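
For comparison, here is a minimal sketch of the same search using slicing, zip and a set, along the lines of the comprehension and slicing ideas mentioned in the first post. It assumes Python 3, where indexing or iterating a bytes object already gives integers, so ord() is not needed. The function name find_xor_matches and the sample data are made up for illustration, and windows containing zero bytes are skipped, as in the first script.

```python
def find_xor_matches(data, needle):
    """Return (offset, key) pairs where needle appears XORed with one byte.

    Both arguments are bytes objects (Python 3 assumed).
    """
    matches = []
    # Slide a window of len(needle) bytes across the data; the +1
    # includes the window that ends exactly at the end of the data.
    for offset in range(len(data) - len(needle) + 1):
        window = data[offset:offset + len(needle)]
        # Like the first script, skip windows that contain zero bytes.
        if 0 in window:
            continue
        # XOR the window with the needle pairwise; a set collapses
        # identical values, so one element means one consistent key.
        keys = {d ^ n for d, n in zip(window, needle)}
        if len(keys) == 1:
            matches.append((offset, keys.pop()))
    return matches


# Hypothetical sample data: 'MESSAGE' XORed with 0x21 after 3 filler bytes.
sample = b'abc' + bytes(b ^ 0x21 for b in b'MESSAGE') + b'xyz'
for offset, key in find_xor_matches(sample, b'MESSAGE'):
    print('Match found at offset 0x%02X with key: 0x%02X' % (offset, key))
```

Looping over offsets with range() is still reasonable here, since the offset itself is needed for the output; the slice and zip just remove the inner index arithmetic.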

Disclaimer: I'm also a n00b to Python. :D

05-12-2010, 03:47 AM
Thanks a lot! Functionally the same, but in about half the lines of code. This is EXACTLY what I was looking for.

Thanks again.

05-12-2010, 06:21 AM
No problem. :D