Error accessing binary data from a python list - python

I'm pretty new to python, using python 2.7. I have to read in a binary file, and then concatenate some of the bytes together. So I tried
f = open("filename", "rb")
j=0
infile = []
try:
    byte = f.read(1)
    while byte != "":
        infile.append(byte)
        byte = f.read(1)
finally:
    f.close()
blerg = (bin(infile[8])<<8 | bin(infile[9]))
print type
where I realize that the recast as binary is probably unnecessary, but this is one of my later attempts.
The error I'm getting is TypeError: 'str' object cannot be interpreted as index.
This is news to me, since I'm not using a string anywhere. What the !##% am I doing wrong?
EDIT: Full traceback
file binaryExtractor.py, line 25, in
blerg = (bin(infile[8])<<8 | bin(infile[9]))
TypeError: 'str' object cannot be interpreted as index

You should be using struct whenever possible instead of writing your own code for this.
>>> import struct
>>> struct.unpack('<H', '\x12\x34')
(13330,)

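You want to use the ord function, which returns an integer from a single-character string, not bin, which returns a string representation of a binary number. Each f.read(1) call returns a one-character str, so infile is a list of strings, and bin() raises the TypeError because it only accepts integers.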
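As a rough sketch of both suggestions, assuming the two bytes at offsets 8 and 9 form a big-endian 16-bit value (which is what infile[8] << 8 | infile[9] implies; the struct example above uses '<H', which is little-endian). The sample data is made up for illustration.

```python
import struct

# Hypothetical file contents: eight padding bytes followed by 0x12 0x34.
data = bytearray(b"\x00" * 8 + b"\x12\x34")

# Indexing a bytearray yields integers, so no ord() call is needed here.
# With a list of one-character Python 2 strings, as in the question, the
# equivalent would be: ord(infile[8]) << 8 | ord(infile[9])
value = (data[8] << 8) | data[9]

# The same result with struct: '>H' is a big-endian unsigned 16-bit integer.
(value_struct,) = struct.unpack(">H", bytes(data[8:10]))

print(value, value_struct)
```

Reading the whole file into a bytearray (for example bytearray(f.read())) avoids the byte-by-byte loop entirely.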

Related

Python cannot convert _io.TextIOWrapper to float

I read a text file which contains numbers as columns without spaces between them as
mediaXoriginal = open('posx_mean_no_acoplo_tf_multiple.txt', 'r')
and I print the contents with
print(mediaXoriginal.read())
However, I need mediaXoriginal to be a float, since I want to interpolate the data inside. If I write
float(mediaXoriginal)
I get the following error.
float() argument must be a string or a number, not '_io.TextIOWrapper'
Can someone tell me how to convert mediaXoriginal to float?
In the end I managed to do it as
with open('posx_mean_no_acoplo_tf_multiple.txt', 'r') as f2:
    data = f2.read()
    print(data)
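Note that f2.read() still returns a single string. To get actual floats for interpolation, each number has to be converted individually. A minimal sketch, assuming the file holds one number per line (the exact layout is not shown in the question); parse_floats is a hypothetical helper name:

```python
def parse_floats(lines):
    """Convert an iterable of text lines to floats, skipping blank lines."""
    return [float(line) for line in lines if line.strip()]

# Stand-in for the lines of the real file; with a file you would write:
#     with open('posx_mean_no_acoplo_tf_multiple.txt', 'r') as f2:
#         values = parse_floats(f2)
sample_lines = ["1.5\n", "2.25\n", "\n", "-3e2\n"]
values = parse_floats(sample_lines)
print(values)
```

float() accepts surrounding whitespace, so the trailing newline on each line does not need to be stripped first.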

Read a txt-file as dict containing a numpy-array

I have a lot of .txt files that I want to read.
The .txt files were saved by converting a python dictionary to a string and saving the string in a .txt file.
param_string = str(parameters-as-dict)
text_file = open(parameter_file_path, "w")
text_file.write(param_string)
text_file.close()
The entries of the dict are of mixed types (float, int, string,...). In some of the files one entry of the dict is a numpy-array and is saved in the txt-file as
'epsilons': array([...])
Because I want to access the values saved in the dict by their names, I now want to read the txt-file and load them as a dict again. This works easily with
f = open(path, 'r')
parameters = ast.literal_eval(f.read())
as long as there is no numpy array in the file. When the numpy-array is present, I get the error:
File ".../python3.6/ast.py", line 84, in _convert
raise ValueError('malformed node or string: ' + repr(node))
ValueError: malformed node or string: <_ast.Call object at 0x7fb5428cc630>
This makes sense, looking at the ast.literal_eval documentation (https://docs.python.org/2/library/ast.html), which says
Safely evaluate an expression node or a Unicode or Latin-1 encoded
string containing a Python literal or container display. The string or
node provided may only consist of the following Python literal
structures: strings, numbers, tuples, lists, dicts, booleans, and
None.
Since I can't re-save the files differently, don't know at which position the array is, and want to avoid cumbersome regex parsing, I'm looking for a solution that transforms my txt-file into a dict containing a numpy-array.
EDIT: the problem is not only the numpy array but also when I saved an object of e.g. a specific class:
, 'foo' : <class bar>,
A solution where everything that cannot be parsed as a number, bool, or other known datatype is automatically saved as a string, just as it is, would satisfy my needs.
I suggest an iterative approach handling the exceptions as needed. I don't like using eval, perhaps there's a better way but this is quick and dirty and assumes you have safe inputs.
import ast
from numpy import array  # lets eval() resolve array([...])
parameters = {}
with open("file.txt") as f:
    for line in f:
        # split on the first colon only, since values may contain colons
        (key, val) = line.split(':', 1)
        val = val.strip()
        if val[:6] == '<class':
            # string representation like '<class bar>'
            # ast.literal_eval() can't handle this, and neither can eval()
            # this is just a string literal, so keep it as such:
            parameters[key] = val
            continue
        try:
            parameters[key] = ast.literal_eval(val)
        except ValueError as e:
            # for unsupported data structures like np.array
            parameters[key] = eval(val)
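One possible way to avoid eval altogether, sketched under the assumption that the whole file is a single syntactically valid dict display: parse it with ast.parse, try ast.literal_eval on each value, and keep the original source text as a string for anything that is not a literal, such as array([...]). The function name parse_dict_lenient is made up for this sketch, ast.get_source_segment needs Python 3.8 or newer, and entries like <class bar> are not valid Python syntax, so they would still have to be handled separately.

```python
import ast

def parse_dict_lenient(text):
    """Parse a dict display; non-literal values are kept as source strings."""
    tree = ast.parse(text, mode="eval")
    if not isinstance(tree.body, ast.Dict):
        raise ValueError("expected a dict display")
    result = {}
    for key_node, value_node in zip(tree.body.keys, tree.body.values):
        key = ast.literal_eval(key_node)
        try:
            result[key] = ast.literal_eval(value_node)
        except ValueError:
            # Not a plain literal (for example a call such as array([...])):
            # store the text exactly as it appears in the file.
            result[key] = ast.get_source_segment(text, value_node)
    return result

sample = "{'a': 1, 'epsilons': array([0.1, 0.2]), 'name': 'x'}"
parsed = parse_dict_lenient(sample)
print(parsed)
```

The stored string can later be converted on demand, for instance by passing the part inside array(...) to ast.literal_eval and then to np.array.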
I guess you'll have to check for an array line by line. A quick & dirty suggestion:
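import ast
import numpy as np
parameters = {}
with open("file.txt") as f:
    for line in f:
        # split on the first colon only, since values may contain colons
        (key, val) = line.split(':', 1)
        if 'array' in val:
            # take the text between the outer parentheses of array(...)
            s = val.split('(', 1)[1].split(')')[0]
            parameters[key] = np.array(ast.literal_eval(s))
        else:
            parameters[key] = ast.literal_eval(val.strip())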
Maybe for future reference, you can try using the pickle module to save your data.
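A minimal sketch of that pickle suggestion, using a made-up parameters dict and a temporary file path. NumPy arrays can be pickled the same way; a plain list is used here only to keep the example dependency-free.

```python
import os
import pickle
import tempfile

# Hypothetical parameters with mixed value types.
parameters = {"alpha": 0.5, "steps": 10, "label": "run-1",
              "epsilons": [0.1, 0.2, 0.3]}

path = os.path.join(tempfile.mkdtemp(), "parameters.pkl")

# Pickle files must be opened in binary mode.
with open(path, "wb") as f:
    pickle.dump(parameters, f)

with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored)
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.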

Reading integer values from a file

I've been trying to develop a program that will take a list of unique words and a list of their indexes from a file to create the original text but I can't get the integers converted back from str.
file=open("compressed_file_words.txt", "r")
listofwords = file.read()
file=open("compressed_file_word_positions.txt", "r")
positions = file.read()
for i in positions:
    reconstructed_text = reconstructed_text + listofwords[i] + " "
this fails with following error
TypeError: string indices must be integers
How do I get the str converted back to int? I have tried various methods but none seem to work
Try this:
for i in range(0, len(positions)):
    reconstructed_text = reconstructed_text + listofwords[i] + " "
Your problem is that positions is a string, not a list of integers, so i is not an integer either. for iterates over every element of positions and names that element i for the duration of the block; because file.read() returns the whole file as one string, each element is a single character. Your block assumes an integer but gets a string, so you need an integer index, for example one tied to the length of positions. Keep in mind that this loop only removes the TypeError: to use the stored positions themselves, you still have to convert each one with int().
Note that you are probably misusing read and are actually looking for readlines, which returns a list of lines.
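To actually rebuild the text, the positions have to be converted with int() and the word list has to be split into separate words. A rough sketch, assuming both files contain whitespace-separated items and that the positions are zero-based (neither is stated in the question); reconstruct is a hypothetical helper name:

```python
def reconstruct(words_text, positions_text):
    """Rebuild the original text from unique words and their positions."""
    words = words_text.split()
    # int() turns each position token, e.g. "2", into a usable index.
    indexes = [int(token) for token in positions_text.split()]
    return " ".join(words[i] for i in indexes)

# Stand-ins for the contents of the two files returned by file.read().
listofwords = "the cat sat on mat"
positions = "0 1 2 3 0 4"
reconstructed_text = reconstruct(listofwords, positions)
print(reconstructed_text)
```

If the stored positions start at 1 instead of 0, subtract one from each index before the lookup.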

Python - Error when trying to extract bytes from file

I am currently trying to extract the raw binary bytes from a file e.g. 000001001000
f = open(r"file.z", "rb")
try:
    byte = f.read();
    print int(byte)
finally:
    f.close()
The reason I used int(byte) was to have a peek at what the string looks like. (I couldn't print it because of [Decode error - output not utf-8].)
Traceback (most recent call last):
File "C:\Users\werdnakof\Downloads\test.py", line 9, in <module>
print int(byte);
ValueError: invalid literal for int() with base 10: '\x04\x80e\x06\xc0l\x06\xf0,\x02'
It returns \x04\x80e\x06\xc0l\x06\xf0,\x02
And I am not too sure where to go from here. I was told this is in 12-bit fixed-width codes padded on the left.
Any advice or tips on how to solve this? All I want is the 12-bit number e.g.000001001000
Use encode and bin:
bin(int(b.encode("hex"),16))
In [27]: b='\x04\x80e\x06\xc0l\x06\xf0,\x02'
In [28]: int(b.encode("hex"),16)
Out[28]: 21257928890331299851266L
In [29]: bin(int(b.encode("hex"),16))
Out[29]: '0b10010000000011001010000011011000000011011000000011011110000001011000000001
with open("file.z","rb") as f:
    for line in f:
        print(int(line.encode("hex"), 16))
To print the contents of a binary string, you can convert it to hex-representation:
print byte.encode('hex')
For reading binary structures, you can use the struct-module.
Can you try this
f = open("file.z", "rb")
try:
    byte = f.read();
    print(bin(int(str(byte).encode("hex"),16)))
finally:
    f.close()
From Padraic Cunningham's answer
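For the individual 12-bit codes, one possible sketch follows. It assumes the codes are packed back to back with the most significant bit first, which is consistent with the example 000001001000 and the bytes shown in the traceback, but is still an assumption about the file format. The helper name twelve_bit_codes is made up, and any trailing bits that do not fill a whole code are dropped.

```python
def twelve_bit_codes(data):
    """Split raw bytes into 12-bit codes, returned as bit strings."""
    # bytearray() yields integers on both Python 2 and Python 3.
    bits = "".join(format(b, "08b") for b in bytearray(data))
    usable = len(bits) - len(bits) % 12  # ignore an incomplete final code
    return [bits[i:i + 12] for i in range(0, usable, 12)]

# The bytes from the traceback in the question.
data = b"\x04\x80e\x06\xc0l\x06\xf0,\x02"
codes = twelve_bit_codes(data)
numbers = [int(code, 2) for code in codes]
print(codes)
print(numbers)
```

Each code can be turned into a number with int(code, 2), as done for numbers above.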

Python write to file

I've got a little problem here.
I'm converting binary to ascii, in order to compress data.
All seems to work fine, but when I convert '11011011' to ascii and try to write it into file, I keep getting error
UnicodeEncodeError: 'charmap' codec can't encode character '\xdb' in position 0: character maps to <undefined>
Here's my code:
byte = ""
handleR = open(self.getInput())
handleW = open(self.getOutput(), 'w')
file = handleR.readlines()
for line in file:
    for a in range(0, len(line)):
        chunk = result[ord(line[a])]
        for b in chunk:
            if (len(byte) < 8):
                byte+=str(chunk[b])
            else:
                char = chr(eval('0b'+byte))
                print(byte, char)
                handleW.write(char)
                byte = ""
handleR.close()
handleW.close()
Any help appreciated,
Thank You
I think you want:
handleR = open(self.getInput(), 'rb')
handleW = open(self.getOutput(), 'wb')
That will ensure you're reading and writing byte streams. Also, you can parse binary strings without eval:
char = chr(int(byte, 2))
And of course, it would be faster to use bit manipulation. Instead of appending to a string, you can use << (left shift) and | (bitwise or).
EDIT: For the actual writing, you can use:
handleW.write(bytes([char]))
This creates a bytes object from a list consisting of a single number and writes it to the file.
EDIT 2: Correction, it should be:
handleW.write(bytes([int(byte, 2)]))
There is no need to use chr.
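The bit-manipulation idea mentioned above could look roughly like this. It is a sketch for Python 3, not a drop-in replacement for the original loop: pack_bits is a hypothetical helper, the input is assumed to be a string of '0' and '1' characters, and a final partial byte is padded with zero bits on the right.

```python
import io

def pack_bits(bits):
    """Pack a string of '0'/'1' characters into bytes, MSB first."""
    out = bytearray()
    current = 0  # bits collected so far for the byte in progress
    count = 0    # how many bits current holds
    for bit in bits:
        current = (current << 1) | (1 if bit == "1" else 0)
        count += 1
        if count == 8:
            out.append(current)
            current = 0
            count = 0
    if count:
        # Pad the last, incomplete byte with zeros on the right.
        out.append(current << (8 - count))
    return bytes(out)

# io.BytesIO stands in for a file opened with mode 'wb'.
handle = io.BytesIO()
handle.write(pack_bits("11011011"))
written = handle.getvalue()
print(written)
```

Because the result is a bytes object written in binary mode, no text encoding is involved, so the UnicodeEncodeError cannot occur.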
