Python - Reading binary file with offsets and structs

I've recently gotten back into programming and, as a project to get me going and motivated, decided to write a character editor for Fallout 2. The issue I'm having is that after the first few strings I can't seem to pull the data I need using the file offsets or structs.
This is what I am doing.
The file I am working with is www.retro-gaming-world.com/SAVE.DAT
import struct
savefile = open('SAVE.DAT', 'rb')
try:
    test = savefile.read()
finally:
    savefile.close()
print 'Header: ' + test[0x00:0x18] # returns the save file's header description "'FALLOUT SAVE FILE '"
print "Character Name: " + test[0x1D:0x20+4] # returns the character's name "f1nk"
print "Save game name: " + test[0x3D:0x1E+4] # isn't returning the save name "church" like expected
print "Experience: " + str(struct.unpack('>h', test[0x08:0x04])[0]) # is expected to return the current experience but gives the following error
Output:
Header: FALLOUT SAVE FILE
Character Name: f1nk
Save game name:
Traceback (most recent call last):
File "test", line 11, in <module>
print "Experience: " + str(struct.unpack('>h', test[0x08:0x04])[0])
struct.error: unpack requires a string argument of length 2
I've confirmed the offsets, but the slices just aren't returning what I expect.

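test[0x08:0x04] is an empty string because the end index is smaller than the start index. The same applies to test[0x3D:0x1E+4]: the end index 0x1E+4 is 0x22, which is smaller than the start index 0x3D, so that slice is empty too.
For example, test[0x08:0x0A] would give you the two bytes required by the h format code.
The syntax for string slicing is s[start:end] or s[start:end:step], where end is an absolute position in the string, not a length.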
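One way to avoid mixing up end positions and lengths is to slice by offset plus length. The sketch below is written for Python 3 and uses a small made-up buffer, not the real SAVE.DAT layout. The helper read_field and the offsets in it are hypothetical, chosen only for illustration.

```python
import struct


def read_field(data, offset, length):
    """Return `length` bytes of `data` starting at `offset`."""
    return data[offset:offset + length]


# Made-up 16-byte buffer: 4-byte tag, 4-byte name, big-endian short, padding.
data = b"HEAD" + b"f1nk" + struct.pack(">h", 1234) + b"\x00" * 6

name = read_field(data, 0x04, 4).decode("ascii")
experience = struct.unpack(">h", read_field(data, 0x08, 2))[0]

# A slice whose end is smaller than its start is empty, which is what
# makes struct.unpack complain about the argument length.
empty = data[0x08:0x04]

print(name, experience, len(empty))
```

Passing a length instead of an end position means a field can never come out empty just because the second number is smaller than the first.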

Related

IndexError: List index out of range with glob(), rsplit()

I am trying to run a Python script, but it raises an IndexError. I understand that the rsplit() method failed to split the string, but I don't know why the index is out of range. Could anyone tell me how to solve this problem?
Code:
raw_directory = 'results/'
for name in glob.glob(raw_directory + '*.x*'):
    try:
        #with open(name) as g:
        #    pass
        print(name)
        reaction_mechanism = 'gri30.xml' #'mech.cti'
        gas = ct.Solution(reaction_mechanism)
        f = ct.CounterflowDiffusionFlame(gas, width=1.)
        name_only = name.rsplit('\\',1)[1] #delete directory in filename
        file_name = name_only
        f.restore(filename=raw_directory + file_name, name='diff1D', loglevel=0)
Output
If I delete the file strain_loop_07.xml, I get the same error with another file.
results/strain_loop_07.xml
Traceback (most recent call last):
  File "code.py", line 38, in <module>
    name_only = name.rsplit('\\',1)[1] #delete directory in filename
IndexError: list index out of range
If rsplit cannot find the separator, it returns a list with a single element (the whole string), so only index [0] exists and [1] raises the IndexError.
Your printed output shows that name holds text like "results/strain_loop_07.xml", which contains a forward slash and no backslash, so you want to split on '/' instead, with a line more like
name_only = name.rsplit('/', 1)[1]
That gives you the "strain_loop_07.xml" element, which is probably what you wanted, because name.rsplit('/', 1) returns something like
['results', 'strain_loop_07.xml']
By the way, don't hesitate to print your variables midway for debugging. It is often the quickest way to understand the state of a variable at a specific point, here right before your split!
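To see the difference between the two separators, here is a small sketch using the path from the question's output. It also shows os.path.basename from the standard library as an alternative that avoids hard-coding the separator. That alternative is a suggestion and was not part of the original answer.

```python
import os.path

name = "results/strain_loop_07.xml"

# The separator is absent, so the list has one element and [1] would fail.
parts_backslash = name.rsplit("\\", 1)

# The separator is present, so the list has two elements.
parts_slash = name.rsplit("/", 1)

# basename returns the final path component without naming the separator.
base = os.path.basename(name)

print(parts_backslash)
print(parts_slash)
print(base)
```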

What is the Python version of this?

How would you do this in Python?
(It goes through a file, prints the strings found between author": " and ", and between text": " and \, and writes them to their own files.)
Here is an example string before it goes through this:
{"text": "Love this series!\ufeff", "time": "Hace 11 horas", "author": "HasRah", "cid": "UgyvXmvSiMjuDrOQn-l4AaABAg"}
#!/bin/bash
cat html.txt | awk -F 'author": "' {'print $2'} | cut -d '"' -f1 >> users.txt
cat html.txt | awk -F 'text": "' {'print $2'} | cut -d '\' -f1 >> comments.txt
I tried to do it like this in Python (it didn't work):
import re
start = '"author": "'
end = '", '
st = open("html.txt", "r")
s = st.readlines()
u = re.search('%s(.*)%s' % (start, end), s).group(1)
#print u.group(1)
Not sure if I'm close.
I get this error code:
Traceback (most recent call last):
File "test.py", line 9, in <module>
u = re.search('%s(.*)%s' % (start, end), s).group(1)
File "/usr/lib/python2.7/re.py", line 146, in search
return _compile(pattern, flags).search(string)
TypeError: expected string or buffer
Before getting into any of this: As chepner pointed out in a comment, this input looks like, and therefore is probably intended to be, JSON. Which means you shouldn't be parsing it with regular expressions; just parse it as JSON:
>>> import json
>>> s = ''' {"text": "Love this series!\ufeff", "time": "Hace 11 horas", "author": "HasRah", "cid": "UgyvXmvSiMjuDrOQn-l4AaABAg"}'''
>>> obj = json.loads(s)
>>> obj['author']
'HasRah'
Actually, it's not clear whether your input is a JSON file (a file containing one JSON text), or a JSONlines file (a file containing a bunch of lines, each of which is a JSON text with no embedded newlines). [1]
For the former, you want to parse it like this:
obj = json.load(st)
For the latter, you want to loop over the lines, and parse each one like this:
for line in st:
    obj = json.loads(line)
… or, alternatively, you can get a JSONlines library off PyPI.
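Putting the line-by-line variant together, a minimal sketch might look like the following. It assumes the input really is one JSON object per line. extract_fields is a hypothetical helper, and the sample lines are simplified stand-ins for html.txt fed through io.StringIO so the example runs on its own.

```python
import io
import json


def extract_fields(lines):
    """Collect the "author" and "text" values from JSON-lines input."""
    authors, comments = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        obj = json.loads(line)
        authors.append(obj["author"])
        comments.append(obj["text"])
    return authors, comments


# Stand-in for open("html.txt"); the second line is made up for illustration.
sample = io.StringIO(
    '{"text": "Love this series!", "time": "Hace 11 horas", "author": "HasRah"}\n'
    '{"text": "Second comment", "time": "now", "author": "SomeoneElse"}\n'
)

authors, comments = extract_fields(sample)
print(authors)
print(comments)
```

Writing each list out with "\n".join(...) would then give the equivalent of users.txt and comments.txt from the shell version.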
But meanwhile, if you want to understand what's wrong with your code:
The error message is telling you the problem, although maybe not in the user-friendliest way:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/re.py", line 148, in search
return _compile(pattern, flags).search(string)
TypeError: expected string or bytes-like object
As the docs for search make clear:
re.search(pattern, string, flags=0)
Scan through string looking for the first location where the regular expression pattern produces a match, and return a corresponding MatchObject instance…
You haven't passed it a string, you've passed it a list of strings. That's the whole point of readlines, after all.
There are two obvious fixes here.
First, you could read the whole file into a single string, instead of reading it into a list of strings:
s = st.read()
u = re.search('%s(.*)%s' % (start, end), s).group(1)
Alternatively, you could loop over the lines, trying to match each one. And, if you do this, you still don't need readlines(), because a file is already an iterable of lines:
for line in st:
    u = re.search('%s(.*)%s' % (start, end), line).group(1)
While we're at it, if any of your lines don't match the pattern, this is going to raise an AttributeError. After all, search returns None if there's no match, but then you're going to try to call None.group(1).
There are two obvious fixes here as well.
You could handle that error:
try:
    u = re.search('%s(.*)%s' % (start, end), line).group(1)
except AttributeError:
    pass
… or you could check whether you got a match:
m = re.search('%s(.*)%s' % (start, end), line)
if m:
    u = m.group(1)
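As a self-contained illustration of the match check, the sketch below runs the search over two sample lines, one of which has no author field. Note one deliberate change from the pattern above: it uses a non-greedy (.*?) so the capture stops at the first closing quote and comma. The sample lines are made up for the example.

```python
import re

# Non-greedy capture: stop at the first '", ' after the author value.
pattern = re.compile(r'"author": "(.*?)", ')

lines = [
    '{"text": "Love this series!", "time": "Hace 11 horas", "author": "HasRah", "cid": "abc"}',
    "this line has no author field",
]

authors = []
for line in lines:
    m = pattern.search(line)
    if m:  # search returns None when nothing matches
        authors.append(m.group(1))

print(authors)
```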
[1] In fact, there are at least two other formats that are nearly, but not quite, identical to JSONlines. I think that if you only care about reading, not creating files, and you don't have any numbers, you can parse all of them with a loop around json.loads or with a JSONlines library. But if you know who created the file, and know that they intended it to be, say, NDJ rather than JSONlines, you should read the docs on NDJ, or get a library made for NDJ, rather than just trusting that some guy on the internet thinks it's OK to treat it as JSONlines.

Calling str when using functions

I'm writing some code to print a triangle with a given number of rows, but when I try it, it says:
how many rows in the triangle 5
Traceback (most recent call last):
File "U:\School\Homework\year 8\module 3\IT\python\lesson 10\extention task set by Mr Huckitns.py", line 6, in <module>
triangle(5)
File "U:\School\Homework\year 8\module 3\IT\python\lesson 10\extention task set by Mr Huckitns.py", line 5, in triangle
print((x*(str(" ")))(int(i)*(str("*")))((int(row)-int(i))*(str(" "))))
TypeError: 'str' object is not callable
Does anybody know what's going on here?
The code I am using is:
inttrow=int(input("how many rows in the triangle "))
def triangle(row):
    for i in range(1,row):
        x=int(inttrow)-int(i)
        print((x*(str(" ")))(int(i)*(str("*")))((int(row)-int(i))*(str(" "))))
triangle(5)
The problem is the punctuation in your print statement. You're printing three strings in succession, but you forgot to put any concatenation operation between them. Try this:
print ((x*(str(" "))) + (int(i)*(str("*"))) + ((int(row)-int(i))*(str(" "))))
Further, why are you doing all these type coercions -- all of those variables already have the proper types. Cut it down to this:
print (x*" " + i*"*" + (row-i)*" ")
You are trying to concatenate strings by placing them next to each other in the code like this:
("hello")(" ")("world")
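Try that on the command line and see what happens. It is not valid concatenation syntax in Python: a parenthesized expression followed by another pair of parentheses is read as a function call, so Python tries to call the string "hello" and raises TypeError: 'str' object is not callable. Then try using the plus sign.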
"hello" + " " + "world"
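For reference, here is one way the whole function could look with explicit concatenation. This is a sketch rather than the asker's exact code: triangle_rows is a hypothetical name, it returns the rows instead of printing them directly, it drops the trailing spaces, and it uses range(1, rows + 1) on the assumption that all requested rows should be drawn.

```python
def triangle_rows(rows):
    """Return the lines of a right-aligned triangle with `rows` rows."""
    lines = []
    for i in range(1, rows + 1):
        # Leading spaces, then the stars, joined with +.
        lines.append(" " * (rows - i) + "*" * i)
    return lines


for line in triangle_rows(5):
    print(line)
```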

Appending print commands with .decode() method

I have a script of about 300 lines (part of which is pasted below) with a lot of print commands. I am trying to clean up the output it produces. If I leave it the way it is, all the print commands print bytes, with \r\n, to the console.
I figured that if I append .decode('utf-8') to the variable that I need to print, the output is what I expect (a Unicode string). For example, compare the print (data1) and print (data3) commands below. What I want to do is go through all of the code and append .decode() to the argument of every print statement.
All the print commands are in this format: print (dataxxxx)
import telnetlib
import time
import sys
import random
from xlwt import Workbook
shelfIp = "10.10.10.10"
shelf = "33"
print ("Shelf IP is: " + str(shelfIp))
print ("Shelf number is: " + str(shelf))
def addCard():
    tn = telnetlib.Telnet(shelfIp)
    ### Telnet session
    tn.read_until(b"<",5)
    cmd = "ACT-USER::ADMIN:ONE::ADMIN;"
    tn.write(bytes(cmd,encoding="UTF-8"))
    data1 = tn.read_until(b"ONE COMPLD", 5)
    print (data1.decode('utf-8'))
    ### Entering second network element
    cmd = "ENT-CARD::CARD" + shelf + "-" + shelf + ":TWO:xyz:;"
    tn.write(bytes(cmd,encoding="UTF-8"))
    data3 = tn.read_until(b"TWO COMPLD", 5)
    print (data3)
    ### Entering third network element
    cmd = "ENT-CARD::CARD-%s-%s:ADM:ABC:;" %(shelf,shelf)
    tn.write(bytes(cmd,encoding="UTF-8"))
    dataAmp = tn.read_until(b"ADM COMPLD", 5)
    print (dataAmp)
    tn.close()
addCard()
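If you are looking into doing some sort of find-and-replace on the code, you can try this:
import re
# Open in text mode so the str pattern below can be applied to the contents
f = open('script.py', 'r')
script = f.read()
f.close()
# Raw strings keep the regex backslashes intact; \s* allows the space in "print (data)"
newscript = re.sub(r"(print\s*\(.*)\)", r"\g<1>.decode('utf-8'))", script)
f = open('script.py', 'w')
f.write(newscript)
f.close()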
What I did in the regular expression:
Catch text that contains print(......) and save the print(..... part, everything up to the final closing parenthesis, into group 1.
Replace the match with group 1, written as \g<1>, followed by .decode('utf-8')), so the saved text becomes the prefix of the replacement.
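To check what such a substitution does before running it on a real file, it can be tried on a few sample lines first. In the sketch below, add_decode is a hypothetical helper, and the pattern allows optional whitespace after print so that it matches the print (data3) style used in the question. The second call shows a side effect worth knowing about: the rewrite is applied to every print call, including ones whose argument is already a str.

```python
import re

# Group 1 captures everything from "print(" up to the last ")" on the line.
PRINT_CALL = re.compile(r"(print\s*\(.*)\)")


def add_decode(source):
    """Append .decode('utf-8') inside every print(...) call in `source`."""
    return PRINT_CALL.sub(r"\g<1>.decode('utf-8'))", source)


bytes_case = add_decode("print (data3)\nx = 1\n")
str_case = add_decode('print ("Shelf number is: " + str(shelf))')

print(bytes_case)
print(str_case)
```

The second result calls .decode on the str returned by str(shelf), which raises AttributeError on Python 3, so the rewritten script would still need a manual review.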
Appending .decode() to the print() call itself will fail because print() returns None, and None has no .decode() method; .decode() has to be called on the bytes value being printed.
>>> x=u"testing"
>>> print(x).decode('utf-8')
testing
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'decode'
You must apply .decode('utf-8') to the variables you wish to decode, which is not easily accomplished using regex based tools.
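An alternative that sidesteps source rewriting is to route the output through a small helper that decodes only when it is handed bytes. This is a sketch, not something from the original answers, and show is a hypothetical name. The print (dataxxxx) calls could then be replaced with show(dataxxxx) by a plain find-and-replace.

```python
def show(data, encoding="utf-8"):
    """Print `data`, decoding it first if it is bytes; return the printed text."""
    if isinstance(data, bytes):
        text = data.decode(encoding)
    else:
        text = str(data)
    print(text)
    return text


# Works for bytes read from the telnet session and for ordinary strings.
from_bytes = show(b"TWO COMPLD\r\n")
from_str = show("Shelf number is: 33")
```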

How can I write this function that mostly prints to a file?

So I posted about another part of this code yesterday, but I've run into another problem. I made a character generator for an RPG and I'm trying to get the program to write the output of a character sheet function to a .txt file, but I think what's happening is that the function may return a None value for some of the stats (which is totally normal), and then I get an error because of that when I try to write to the .txt file. I'm totally stumped, and help would be vastly appreciated!
# Character Sheet Function.
def char_shee():
    print "Name:", name
    print "Class:", character_class
    print "Class Powers:", class_power
    print "Alignment:", alignment
    print "Power:", pow, pow_mod()
    print "Intelligence:", iq, iq_mod()
    print "Agility:", agi, agi_mod()
    print "Constitution:", con, con_mod()
    print "Cynicism:", cyn, cyn_mod()
    print "Charisma:", cha, cha_mod()
    print "All Characters Start With 3 Hit Dice"
    print"""
\t\t{0}'s History
\t\t------------------
\t\tAge:{1}
\t\t{2}
\t\t{3}
\t\t{4}
\t\t{5}
\t\t{6}
\t\t{7}
\t\t{8}
\t\t{9}
\t\tGeneral Disposition: {10}
\t\tMost important thing is: {11}
\t\tWho is to blame for worlds problems: {12}
\t\tHow to solve the worlds problems: {13}
""".format(name, age, gender_id, ethnic_pr, fcd, wg, fogo_fuck, cur_fam,fam_fuk, nat_nur, gen_dis, wha_wor, who_pro, how_pro)
char_shee()
print "Press enter to continue"
raw_input()
# Export to text file?
print """Just because I like you, let me know if you want this character
saved to a text file. Please remember if you save your character not to
name it after something important, or you might lose it.
"""
text_file = raw_input("Please type 'y' or 'n', if you want a .txt file")
if text_file == "y":
    filename = raw_input("\nWhat are we calling your file, include .txt")
    target = open(filename, 'w')
    target.write(char_shee()
    target.close
    print "\nOk I created your file."
    print """
Thanks so much for using the Cyberpanky N.O.W Character Generator
By Ray Weiss
Goodbye
"""
else:
    print """
Thanks so much for using the Cyberpanky N.O.W Character Generator
By Ray Weiss
Goodbye
"""
EDIT: Here is the output I get:
Please type 'y' or 'n', if you want a .txt filey

What are we calling your file, include .txt123.txt
<function char_shee at 0x2ba470>
Traceback (most recent call last):
  File "cncg.py", line 595, in <module>
    target.write(pprint(char_shee))
TypeError: must be string or read-only character buffer, not None
Using print writes to sys.stdout; it doesn't return a value.
If you want char_shee to return the character sheet as a string so you can write it to a file, you'll need to build that string instead of printing it.
To ease building the string, use a list to collect your strings:
def char_shee():
    sheet = []
    sheet.append("Name: " + name)
    sheet.append("Class: " + character_class)
    # ... more appends ...
    # Return the string with newlines
    return '\n'.join(sheet)
You forgot the parentheses here:
target.write(char_shee())
target.close()
And as @Martijn Pieters pointed out, you should return the value from char_shee() instead of printing it.
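Combining both answers, a minimal sketch of the build-then-write approach could look like this. It passes the values in as parameters instead of relying on globals. The names char_sheet and save_sheet, the placeholder values, and the temporary output path are all assumptions made for the example. The with block closes the file automatically, so a forgotten close() can't happen.

```python
import os
import tempfile


def char_sheet(name, character_class):
    """Build the character sheet as one string instead of printing it."""
    sheet = []
    sheet.append("Name: " + name)
    sheet.append("Class: " + character_class)
    return "\n".join(sheet)


def save_sheet(filename, text):
    """Write `text` to `filename`; the with block closes the file."""
    with open(filename, "w") as target:
        target.write(text)


# Placeholder values and a temporary path, used only for the example.
filename = os.path.join(tempfile.mkdtemp(), "example_character.txt")
text = char_sheet("Example Name", "Example Class")
save_sheet(filename, text)
print(text)
```

The same string can be printed to the screen and written to the file, so the sheet only has to be assembled in one place.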
