Load and modify YAML without breaking the indents? - python

Hi wanted to update this integer value of x to 8 but I could not get any good understanding on how can I do that without making any effect in indents inside sub in yaml file. I wanted to read this YAML file, replace the string to x=8 and save the yaml file as it is.
I am using Python for the modification, here is the sample code:
parent:
-
subchild: something
subchild2: something
- sub:
y = 4;
x = 6 # I wanted to replace this integer to 8
z = 10
Note Point: x=6 will be in multiple files, so I wanted to open files one by one, and do all the modifications ( x=8) and save those files one by one.
Problem
I was able to replace the file but what the problem I am facing is, the result becomes this:
parent:
-
subchild: something
subchild2: something
- sub: y = 4; x = 8; z = 10;
And what I want is the same indents inside sub as in the original yaml file. Hope you got point here.
[EDIT] Additional information:
This is the way I am reading the file and saving the file.
f = open(test.yaml, 'r')
newf = f.read().replace('6', '8')
overrides = yaml.load(newf)
f.close()
with open('updated_test.yaml', 'w') as ff:
yaml.dump(overrides, ff)

Just don't use the yaml module for this task at all, as you aren't doing anything what it is needed for ! ;)
f = open("test.yaml", 'r')
newf = f.read().replace('6', '8')
f.close()
with open('updated_test.yaml', 'w') as ff:
ff.write(newf)

Your input is invalid YAML, as you cannot start an indented sequence after the second something.
Your output is invalid as well.
If you want something like:
y = 4;
x = 6 # I wanted to replace this integer to 8
z = 10
in your YAML document as a multiline string, and preserve the multiple lines you
need to use ruamel.yaml and specify the string as a literal block scalar (using |).
And since x = 6 is part of the string, you need to do a string replacement to
change the value. I would use a regex for that:
import sys
import re
import ruamel.yaml
yaml_str = """\
- sub: |
y = 4;
x = 6 # I wanted to replace this integer to 8
z = 10
"""
yaml = ruamel.yaml.YAML()
# yaml.indent(mapping=4, sequence=4, offset=2)
# yaml.preserve_quotes = True
data = yaml.load(yaml_str)
target = data[0]['sub']
new_value = 8
data[0]['sub'] = type(target)(re.sub('x = [0-9]*', f'x = {new_value}', target))
yaml.dump(data, sys.stdout)
which gives:
- sub: |
y = 4;
x = 8 # I wanted to replace this integer to 8
z = 10

Related

Reading variable length binary values from a file in python

I have three text values that I am encrypting and then writing to a file. Later I want to read the values back (in another script) and decrypt them.
I've successfully encrypted the values:
cenc = rsa.encrypt(client_name.encode('utf8'), publicKey)
denc = rsa.encrypt(expiry_date.encode('utf8'), publicKey)
fenc = rsa.encrypt(features.encode('utf8'), publicKey)
and written to a binary file:
licensefh = open("license.sfb", "wb")
licensefh.write(cenc)
licensefh.write(denc)
licensefh.write(fenc)
licensefh.close()
The three values cenc, denc and fenc are all of different lengths so when I read the file back:
licensefh = open("license.sfb", "rb")
encMessage = licensefh.read()
encMessage contains the entire file and I don't know how to get the three values back again.
I've tried using a separator between the values:
SEP = bytes(chr(0x02).encode('utf8'))
...
licensefh.write(cenc)
licensefh.write(SEP)
...
and then using encMessage.partition(SEP) or encMessage.split(SEP) but the data invariably contains the SEP value in it somewhere (I've tried a few different characters) so that didn't work.
I tried getting the length of the bytes objects cenc, denc and fenc, but this returned 256 for each value even though the contents of the variables are all different lengths.
My question is this. How do I write these three variable length values to a binary file and then separate them when I read them back again?
Here's an example of the 3 binary values:
b'tX\x10Fo\x89\x10~\x83Pok\xd1\xfb\xbe\x0e<a\xe5\x11md:\xe6\x84#\xfa\xf8\xe5\xeb\xf8\xdc{\xc0Z\xa0\xc0^\xc1\xd9\x820\xec\xec\xb0R\x99/\xa2l\x88\xa9\xa6g\xa3\x01m\xf9\x7f\x91\xb9\xe1\x80\xccs|\xb7_\xa9Fp\x11yvG\xdc\x02d\x8aK2\x92t\x0e\x1f\xca\x19\xbb&\xaf{\xc0y>\t|\x86\xab\x16.\xa5kZ"\xab6\xaaV\xf4w\x7f\xc5q\x07\xef\xa9\xa5\xa3\xf3 6\xdb\x03\x19S\xbd\x81\xf9\xc8\xc5\x90\x1e\x19\x86\xa4q\xe3?i\xc4\xac\t\xd5=3C\x9b#\xc3IuAN,\xeat\xc6\x96VFL\x1eFWZ\xa4\xd73\x92P#\x1d\xb9\x12\x15\xc9\xd4~\x8aWm^\xb8\x8b\x9d\x88\n)\xeb#\xe3\x93\xb1\\\xd6^\xe0\xce\xa2(\x05\xf5\xe6\x8b\xd1\x15\xd8v\xf0\xae\x90\xd8?\x01\r\x00\xf4\xa5\xadM|%\x98\xa9SR\xc6\xd0K\x9e&\xc3\xe0M\x81\x87\xdea\xcc\xd5\x9c\xcd\xfd1l\x1f\xb9?\xed\xd1\x95\xbc\x11\x85U9'
b'l\xd3S\xcc\x03\x9a\xf2\xfdr\xca\xbbA\x06\xfb\xd8\xbbWi\xdc\xb1\xf6&\x97T\x81Kl\r\x86\x9b\x95?\x94}\x8a\xd3\xa1V\x81\xd3]*B\x1f\x96`\xa3\xd1\xf2|B\x84?\xa0\ns\xb7\xcf\x18Y\x87\xcfR\x87!\x14\x81!\xf7\xf2\xe5x|=O\xe3\xba2\xf2!\x93\x0fT7\x0c~4\xa3\xe5\xb7\xf9wy\xb5\x12FM\x96\xd9\xfd\xedn\x9c\xacw\x1b\xc2\x17+\xb6\x05`\x10\xf8\xe4\x01\xde\xc7\xa2\xa0\x80\xd8\x15\xb1+<s\xc7\x19\x9c\x14\xb0\x1a"\x10\xbb\x0f\xe1\x05\x93\xd2?xX\xd9\x93\x8an\x8d\xcd\xbd!c\xd0,\xa45\xbai\xe3\xccx\x08\xaa,\xd1\xe5\'t\x91\xb8\xf2n$\x0c\xf9-\xb4\xc2\x07\x81\xe1\xe7\x8e\xb3\x98\x11\xf3\xa6\xd9wz\x9a3\xc9\x9c?z\xd8\xaa\x08}\xa2\x9c[\xf2\x9d\xe4\xcdb\xddl\xceV\x7f\xf1\x81\xb3\x88\x1e\x9c5?k\x0f\xc9\x86\x86&\xedV.\xa7\x8d\x13&V\xad\xca\xe5\x93\xfe\xa5\x94\xbc\xf5\xd1{Cl\xc0\x030\x92\x03\xc9'
b'#\xbdd7\xe9\xa0{\t\xb9\x87B\x9e\xf9\x97P^\xf3V\xb6\x93\x1f(J\x0b\xa3\xbf\xd8\x04\x86T\xa4\xca\xf3\xe8%\xddC\x11\xdb5\xff,\xf7\x13\xd7\xd2\xbc\xf3\x893\x83\xdcmJ\xc8p\xdf\x07V\x7fb\xeb\xa9\x8b\x0f\xca\xf9\x05\xfc\xdfS\x94b\x90\xcd\xfcn?/]\x11\xaf\xe606\xfb\\U59\xa0>\xbd\xd8\x1c\xa8\xca\x83\xf4C\x95v7\xc6\xe00\xe4,d_/\x83\xa0\xb9mO\x0e\xc4\x97J\x15\xf0\xca-\xa0\xafT\xe4\x82\x03\n\x14:\xa1\xdcL\x98\x9d,1\xfa\x10\xf4\xfd\xa0\x0b\xc7\x13!\xf7\xdb/\xda\x1a\x9df\x1cQ\xc0\x99H\x08\xa0c\x8f9/4\xc4\x05\xc6\x9eM\x8e\xe5V\xf8D\xc3\xfd\xad4\x94A\xb9[\x80\xb9\xcf\xe6\xd9\xb3M2\xd9N\xfbA\x18\x84/W\x9b\x92\xfe\xbb\xd6C\x85\xa3\xc6\xd2T\xd0\xb2\xb9\xf7R\xb4(s\xda\xbcX,9w\x17\x1c\xfb|\xa0\x87\xba\xca6>y\xba\\L4wc\x94\xe7$Y\x89\x07\x9b\xfe\x9b?{\x85'
#pippo1980 's comment is how I would do it, using struct :
import struct
cenc = b'tX\x10Fo\x89\x10~\x83Pok\xd1\xfb\xbe\x0e<a\xe5\x11md:\xe6\x84#\xfa\xf8\xe5\xeb\xf8\xdc{\xc0Z\xa0\xc0^\xc1\xd9\x820\xec\xec\xb0R\x99/\xa2l\x88\xa9\xa6g\xa3\x01m\xf9\x7f\x91\xb9\xe1\x80\xccs|\xb7_\xa9Fp\x11yvG\xdc\x02d\x8aK2\x92t\x0e\x1f\xca\x19\xbb&\xaf{\xc0y>\t|\x86\xab\x16.\xa5kZ"\xab6\xaaV\xf4w\x7f\xc5q\x07\xef\xa9\xa5\xa3\xf3 6\xdb\x03\x19S\xbd\x81\xf9\xc8\xc5\x90\x1e\x19\x86\xa4q\xe3?i\xc4\xac\t\xd5=3C\x9b#\xc3IuAN,\xeat\xc6\x96VFL\x1eFWZ\xa4\xd73\x92P#\x1d\xb9\x12\x15\xc9\xd4~\x8aWm^\xb8\x8b\x9d\x88\n)\xeb#\xe3\x93\xb1\\\xd6^\xe0\xce\xa2(\x05\xf5\xe6\x8b\xd1\x15\xd8v\xf0\xae\x90\xd8?\x01\r\x00\xf4\xa5\xadM|%\x98\xa9SR\xc6\xd0K\x9e&\xc3\xe0M\x81\x87\xdea\xcc\xd5\x9c\xcd\xfd1l\x1f\xb9?\xed\xd1\x95\xbc\x11\x85U9'
denc = b'l\xd3S\xcc\x03\x9a\xf2\xfdr\xca\xbbA\x06\xfb\xd8\xbbWi\xdc\xb1\xf6&\x97T\x81Kl\r\x86\x9b\x95?\x94}\x8a\xd3\xa1V\x81\xd3]*B\x1f\x96`\xa3\xd1\xf2|B\x84?\xa0\ns\xb7\xcf\x18Y\x87\xcfR\x87!\x14\x81!\xf7\xf2\xe5x|=O\xe3\xba2\xf2!\x93\x0fT7\x0c~4\xa3\xe5\xb7\xf9wy\xb5\x12FM\x96\xd9\xfd\xedn\x9c\xacw\x1b\xc2\x17+\xb6\x05`\x10\xf8\xe4\x01\xde\xc7\xa2\xa0\x80\xd8\x15\xb1+<s\xc7\x19\x9c\x14\xb0\x1a"\x10\xbb\x0f\xe1\x05\x93\xd2?xX\xd9\x93\x8an\x8d\xcd\xbd!c\xd0,\xa45\xbai\xe3\xccx\x08\xaa,\xd1\xe5\'t\x91\xb8\xf2n$\x0c\xf9-\xb4\xc2\x07\x81\xe1\xe7\x8e\xb3\x98\x11\xf3\xa6\xd9wz\x9a3\xc9\x9c?z\xd8\xaa\x08}\xa2\x9c[\xf2\x9d\xe4\xcdb\xddl\xceV\x7f\xf1\x81\xb3\x88\x1e\x9c5?k\x0f\xc9\x86\x86&\xedV.\xa7\x8d\x13&V\xad\xca\xe5\x93\xfe\xa5\x94\xbc\xf5\xd1{Cl\xc0\x030\x92\x03\xc9'
fenc = b'#\xbdd7\xe9\xa0{\t\xb9\x87B\x9e\xf9\x97P^\xf3V\xb6\x93\x1f(J\x0b\xa3\xbf\xd8\x04\x86T\xa4\xca\xf3\xe8%\xddC\x11\xdb5\xff,\xf7\x13\xd7\xd2\xbc\xf3\x893\x83\xdcmJ\xc8p\xdf\x07V\x7fb\xeb\xa9\x8b\x0f\xca\xf9\x05\xfc\xdfS\x94b\x90\xcd\xfcn?/]\x11\xaf\xe606\xfb\\U59\xa0>\xbd\xd8\x1c\xa8\xca\x83\xf4C\x95v7\xc6\xe00\xe4,d_/\x83\xa0\xb9mO\x0e\xc4\x97J\x15\xf0\xca-\xa0\xafT\xe4\x82\x03\n\x14:\xa1\xdcL\x98\x9d,1\xfa\x10\xf4\xfd\xa0\x0b\xc7\x13!\xf7\xdb/\xda\x1a\x9df\x1cQ\xc0\x99H\x08\xa0c\x8f9/4\xc4\x05\xc6\x9eM\x8e\xe5V\xf8D\xc3\xfd\xad4\x94A\xb9[\x80\xb9\xcf\xe6\xd9\xb3M2\xd9N\xfbA\x18\x84/W\x9b\x92\xfe\xbb\xd6C\x85\xa3\xc6\xd2T\xd0\xb2\xb9\xf7R\xb4(s\xda\xbcX,9w\x17\x1c\xfb|\xa0\x87\xba\xca6>y\xba\\L4wc\x94\xe7$Y\x89\x07\x9b\xfe\x9b?{\x85'
packing_format = "<HHH" # little-endian, 3 * (2-byte unsigned short)
with open("license.sfb", "wb") as licensefh:
licensefh.write(struct.pack(packing_format, len(cenc), len(denc), len(fenc)))
licensefh.write(cenc)
licensefh.write(denc)
licensefh.write(fenc)
# close is automatic with a context-manager
with open("license.sfb", "rb") as licensefh2:
header_length = struct.calcsize(packing_format)
cenc2_len, denc2_len, fenc2_len = struct.unpack(packing_format, licensefh2.read(header_length))
cenc2 = licensefh2.read(cenc2_len)
denc2 = licensefh2.read(denc2_len)
fenc2 = licensefh2.read(fenc2_len)
assert len(cenc2) == cenc2_len and len(denc2) == denc2_len and len(fenc2) == fenc2_len # the file was not truncated
unread_bytes = licensefh2.read() # until EOF
assert len(unread_bytes) == 0 # there is nothing else in the file, everything has been read
assert cenc == cenc2
assert denc == denc2
assert fenc == fenc2

How to get read strings from a list(a txt file) and print them out as ints, strings, and floats?

I've tried literally everything to make this work. What i'm trying to do is take a file, assign each line a variable, and then set the type of the variable. It's reading it in the list as [ and ' being a line number, and I don't know what to do. I also have lists inside of the file that I need to save.
My error is:
ValueError: invalid literal for int() for base 10: '['
My code is:
def load_data():
f = open(name+".txt",'r')
enter = str(f.readlines()).rstrip('\n)
print(enter)
y = enter[0]
hp = enter[1]
coins = enter[2]
status = enter[3]
y2 = enter[4]
y3 = enter[5]
energy = enter[6]
stamina = enter[7]
item1 = enter[8]
item2 = enter[9]
item3 = enter[10]
equipped = enter[11]
firstime = enter[12]
armorpoint1 = enter[13]
armorpoint2 = enter[14]
armorpoints = enter[15]
upgradepoint1 = enter[16]
upgradepoint2 = enter[17]
firstime3 = enter[18]
firstime4 = enter[19]
part2 = enter[20]
receptionist = enter[21]
unlocklist = enter[22]
armorlist = enter[23]
heal1 = enter[24]
heal2 = enter[25]
heal3 = enter[26]
unlocked = enter[27]
unlocked2 = enter[28]
float(int(y))
int(hp)
int(coins)
str(status)
float(int(y2))
float(int(y3))
int(energy)
int(stamina)
str(item1)
str(item2)
str(item3)
str(equipped)
int(firstime)
int(armorpoint1)
int(armorpoint2)
int(armorpoints)
int(upgradepoint1)
int(upgradepoint2)
int(firstime3)
int(firstime4)
list(unlocklist)
list(armorlist)
int(heal1)
int(heal2)
int(heal3)
f.close()
SAMPLE FILE:
35.0
110
140
Sharpshooter
31.5
33
11
13
Slimer Gun
empty
empty
Protective Clothes
0
3
15
0
3
15
0
1
False
False
['Slime Slicer', 'Slimer Gun']
['Casual Clothes', 'Protective clothes']
4
3
-1
{'Protective Clothes': True}
{'Slimer Gun': True}
The .readlines() function returns a list, each item containing a separate line. In order to strip the newline from each of the lines, you can use a list comprehension:
f = open("data.txt", "r")
lines = [line.strip() for line in f.readlines()]
You can then proceed to cast each item in the list separately, as you have, or try to somehow automatically infer the type in a loop. This would be easier if you formatted the example file more like a configuration file. This thread has some relevant answers:
Best way to retrieve variable values from a text file?
I think its better if you read the file this way, which first reads and removes whitespace and then splits into lines. Then you can set a variable to each line (also you need to set the result of changing variable types to the variable).
For lists, you may need a function to extract the list from the string. However if you aren't expecting security breaches then using eval() should be fine.
def load_data():
f = open(name+".txt",'r')
content = f.read().rstrip()
lines = content.split("\n")
y = float(int(enter[0]))
hp = int(enter[1])
coins = int(enter[2])
status = enter[3]
# (etc)
unlocklist = eval(enter[22])
armorlist = eval(enter[23])
f.close()

String from file to string utf-8 in python

So I am reading and manipulate a file with :
base_file = open(path+'/'+base_name, "r")
lines = base_file.readlines()
After this I search and find the "raw_data" start of line.
if re.match("\s{0,100}raw_data: ",line):
split_line = line.split("raw_data:")
print(split_line)
raw_string = split_line[1]
One example of raw_data is:
raw_data: "&\276!\300\307 =\277\"O\271\277vH9?j?\345?#\243\264=\350\034\345\277\260\345\033\300\023\017(#z|\273\277L\}\277\210\\031\300\213\263z\277\302\241\033\300\000\207\323\277\247Oh>j\354\215#\364\305\201\276\361+\202#t:\304\277\344\231\243#\225k\002\300vw\262\277\362\220j\300\"(\337\276\354b8\300\230\347H\300\201\320\204\300S;N\300Z0G\300>j\210\000#\034\014\220#\231\330J#\223\025\236#\006\332\230\276\227\273\n\277\353#,#\202\205\215\277\340\356\022\300/\223\035\277\331\277\362\276a\350\013#)\353\276\277v6\316\277K\326\207\300`2)\300\004\014Q\300\340\267\271\300MV\305\300\327\010\207\300j\346o\300\377\260\216\300[\332g\300\336\266\003\300\320S\272?6\300Y#\356\250\034\300\367\277&\300\335Uq>o\010&\300r\277\252\300U\314\243\300\253d\377\300"
And raw_string will be
print(raw_data)
"&\276!\300\307
=\277\"O\271\277vH9?j?\345?#\243\264=\350\034\345\277\260\345\033\300\023\017(#z|\273\277L\}\277\210\\031\300\213\263z\277\302\241\033\300\000\207\323\277\247Oh>j\354\215#\364\305\201\276\361+\202#t:\304\277\344\231\243#\225k\002\300vw\262\277\362\220j\300\"(\337\276\354b8\300\230\347H\300\201\320\204\300S;N\300Z0G\300>j\210\000#\034\014\220#\231\330J#\223\025\236#\006\332\230\276\227\273\n\277\353#,#\202\205\215\277\340\356\022\300/\223\035\277\331\277\362\276a\350\013#)\353\276\277v6\316\277K\326\207\300`2)\300\004\014Q\300\340\267\271\300MV\305\300\327\010\207\300j\346o\300\377\260\216\300[\332g\300\336\266\003\300\320S\272?6\300Y#\356\250\034\300\367\277&\300\335Uq>o\010&\300r\277\252\300U\314\243\300\253d\377\300"
If I tried to read this file I will obtain one char to one char even for escape characters.
So, my question is how to transform this plain text to utf-8 string so that I can have one character when reading \300 and not 4 characters.
I tried to pass "encondig =utf-8" in open file method but does not work.
I have made the same example passing raw_data as variable and it works properly.
RAW_DATA = "&\276!\300\307 =\277\"O\271\277vH9?j?\345?#\243\264=\350\034\345\277\260\345\033\300\023\017(#z|\273\277L\\}\277\210\\\031\300\213\263z\277\302\241\033\300\000\207\323\277\247Oh>j\354\215#\364\305\201\276\361+\202#t:\304\277\344\231\243#\225k\002\300vw\262\277\362\220j\300\"(\337\276\354b8\300\230\347H\300\201\320\204\300S;N\300Z0G\300<I>>j\210\000#\034\014\220#\231\330J#\223\025\236#\006\332\230\276\227\273\n\277\353#,#\202\205\215\277\340\356\022\300/\223\035\277\331\277\362\276a\350\013#)\353\276\277v6\316\277K\326\207\300`2)\300\004\014Q\300\340\267\271\300MV\305\300\327\010\207\300j\346o\300\377\260\216\300[\332g\300\336\266\003\300\320S\272?6\300Y#\356\250\034\300\367\277&\300\335Uq>o\010&\300r\277\252\300U\314\243\300\253d\377\300"
print(f"Qnt -> {len(RAW_DATA)}") # Qnt -> 256
print(type(RAW_DATA))
at = 0
total = 0
while at < len(RAW_DATA):
fin = at+4
substrin = RAW_DATA[at:fin]
resu = FourString_float(substrin)
at = fin
For this example \300 is only one char.
Hope someone can help me.
The problem is that on the read file the escape \ symbols are coming in as \, but in the example you've provided they are being evaluated as part of the numerics that follow it. ie, \276 is read as a single character.
If you run:
RAW_DATA = r"&\276!\300\307 =\277\"O\271\277vH9?j?\345?#\243\264=\350\034\345\277\260\345\033\300\023\017(#z|\273\277L\\}\277\210\\\031\300\213\263z\277\302\241\033\300\000\207\323\277\247Oh>j\354\215#\364\305\201\276\361+\202#t:\304\277\344\231\243#\225k\002\300vw\262\277\362\220j\300\"(\337\276\354b8\300\230\347H\300\201\320\204\300S;N\300Z0G\300<I>>j\210\000#\034\014\220#\231\330J#\223\025\236#\006\332\230\276\227\273\n\277\353#,#\202\205\215\277\340\356\022\300/\223\035\277\331\277\362\276a\350\013#)\353\276\277v6\316\277K\326\207\300`2)\300\004\014Q\300\340\267\271\300MV\305\300\327\010\207\300j\346o\300\377\260\216\300[\332g\300\336\266\003\300\320S\272?6\300Y#\356\250\034\300\367\277&\300\335Uq>o\010&\300r\277\252\300U\314\243\300\253d\377\300"
print(f"Qnt -> {len(RAW_DATA)}") # Qnt -> 256
print(type(RAW_DATA))
at = 0
total = 0
while at < len(RAW_DATA):
fin = at+4
substrin = RAW_DATA[at:fin]
resu = FourString_float(substrin)
at = fin
You would should be getting the same error that you were getting originally. Notice that we are using the raw-string literal instead of regular string literal. This will ensure that the \ don't get escaped.
You would need to evaluate the RAW_DATA to force it to evaluate the \.
You can do something like RAW_DATA = eval(f'"{RAW_DATA}"') or
import ast
RAW_DATA = ast.literal_eval(f'"{RAW_DATA}"')
Note, the second option is a bit more secure that doing a straight eval as you are limiting the scope of what can be executed.

how to convert a string to the desired form?

Good day! I'm trying to implement a module for testing knowledge. The user is given the task, he wrote the decision that is being sent and executed on the server. The question in the following. There are raw data that is stored in the file. Example -
a = 5
b = 7
There is a custom solution that is stored in the string. example
s = a * b
p = a + b
print s,p
Now it is all written in a separate file as a string.
'a = 5\n', 'b = 7', u's = a * b\r\np = a + b\r\nprint s,p'
How to do this so that the code used can be performed. Will be something like that.
a = 5
b = 7
s = a * b
p = a + b
print s,p
Here's my function to create a solution and executes it if necessary.
def create_decision(user_decision, conditions):
f1 = open('temp_decision.py', 'w')
f = open(conditions, 'r+')
contents = f.readlines()
contents.append(user_decision)
f1.write(str(contents))
f1.close()
output = []
child_stdin, child_stdout, child_stderr = os.popen3("python temp_decision.py")
output = child_stdout.read()
return output
Or tell me what I'm doing wrong? Thanks!
You don't need to create a tempfile, you can simply use exec. create_decision would then look so:
A
def create_decision(user_decision, conditions):
f = open(conditions, 'r+')
contents = f.readlines()
contents.append(user_decision)
# join list entries with a newline between and return result as string
output = eval('\n'.join(contents))
return output
B
import sys
import StringIO
import contextlib
#contextlib.contextmanager
def stdoutIO(stdout=None):
old = sys.stdout
if stdout is None:
stdout = StringIO.StringIO()
sys.stdout = stdout
yield stdout
sys.stdout = old
def create_decision(user_decision, conditions):
f = open(conditions, 'r+')
contents = f.readlines()
contents.append(user_decision)
with stdoutIO() as output:
#exec('\n'.join(contents)) # Python 3
exec '\n'.join(contents) # Python 2
return output.getvalue()
You should also use str.join() to make one string out of the list of conditions. Otherwise you couldn't write it to a file or execute it (I've done this already in the above function).
Edit: There was a bug in my code (exec doesn't return anything) so I've added a method with eval (but that won't work with print, because it evaluates and returns the result of one expression, more info here). The second method captures the output of print and stdoutIO is from an other question/answer (here). This method returns the output from print, but is a bit more complicated.
You can execute a string like so: exec("code")

Should I be comparing bytes using struct?

I'm trying to compare the data within two files, and retrieve a list of offsets of where the differences are.
I tried it on some text files and it worked quite well..
However on non-text files (that still contain ascii text), I call them binary data files. (executables, so on..)
It seems to think some bytes are the same, even though when I look at it in hex editor, they are obviously not. I tried printing out this binary data that it thinks is the same and I get blank lines where it should be printed.
Thus, I think this is the source of the problem.
So what is the best way to compare bytes of data that could be both binary and contain ascii text? I thought using the struct module by be a starting point...
As you can see below, I compare the bytes with the == operator
Here's the code:
import os
import math
#file1 = 'file1.txt'
#file2 = 'file2.txt'
file1 = 'file1.exe'
file2 = 'file2.exe'
file1size = os.path.getsize(file1)
file2size = os.path.getsize(file2)
a = file1size - file2size
end = file1size #if they are both same size
if a > 0:
#file 2 is smallest
end = file2size
big = file1size
elif a < 0:
#file 1 is smallest
end = file1size
big = file2size
f1 = open(file1, 'rb')
f2 = open(file2, 'rb')
readSize = 500
r = readSize
off = 0
data = []
looking = False
d = open('data.txt', 'w')
while off < end:
f1.seek(off)
f2.seek(off)
b1, b2 = f1.read(r), f2.read(r)
same = b1 == b2
print ''
if same:
print 'Same at: '+str(off)
print 'readSize: '+str(r)
print b1
print b2
print ''
#save offsets of the section of "different" bytes
#data.append([diffOff, diffOff+off-1]) #[begin diff off, end diff off]
if looking:
d.write(str(diffOff)+" => "+str(diffOff+off-2)+"\n")
looking = False
r = readSize
off = off + 1
else:
off = off + r
else:
if r == 1:
looking = True
diffOff = off
off = off + 1 #continue reading 1 at a time, until u find a same reading
r = 1 #it will shoot back to the last off, since we didn't increment it here
d.close()
f1.close()
f2.close()
#add the diff ending portion to diff data offs, if 1 file is longer than the other
a = int(math.fabs(a)) #get abs val of diff
if a:
data.append([big-a, big-1])
print data
Did you try difflib and filecmp modules?
This module provides classes and
functions for comparing sequences. It
can be used for example, for comparing
files, and can produce difference
information in various formats,
including HTML and context and unified
diffs. For comparing directories and
files, see also, the filecmp module.
The filecmp module defines functions
to compare files and directories, with
various optional time/correctness
trade-offs. For comparing files, see
also the difflib module
.
You are probably encountering encoding/decoding problems. Someone may suggest a better solution, but you could try reading the file into a bytearray so you're reading raw bytes instead of decoded characters:
Here's a crude example:
$ od -Ax -tx1 /tmp/aa
000000 e0 b2 aa 0a
$ od -Ax -tx1 /tmp/bb
000000 e0 b2 bb 0a
$ cat /tmp/diff.py
a = bytearray(open('/tmp/aa', 'rb').read())
b = bytearray(open('/tmp/bb', 'rb').read())
print "%02x, %02x" % (a[2], a[3])
print "%02x, %02x" % (b[2], b[3])
$ python /tmp/diff.py
aa, 0a
bb, 0a

Categories