Writing array as string (python 3) - python

I am trying to write an entire array to a text or CSV file.
from array import array as pyarray
import csv
tmp1 = (x for x in range(10))
tmp2 = (x+10 for x in range(10))
arr1 = pyarray('l')
with open ('fileoutput','wb') as fil1:
    for i in range(10):
        val = next(tmp1) - next(tmp2)
        arr1.append(val)
    arr1.tofile(fil1)
The problem with this code is that it writes a binary file. I want to write the values as text so that the file is human-readable. I could loop and write the file line by line, but the real problem has millions of values in arr1. What is an efficient way to write the array in human-readable form?
Edit:
After changing the open call above to with open ('fileoutput','w') as fil1: (i.e. 'wb' to 'w'), I get this error:
write() argument must be str, not bytes
So this does not solve the problem. Any suggestions?

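You opened the file in 'wb' mode, which writes binary data. Open it in 'w' mode to write text:
with open ('fileoutput','w') as fil1:
Note that arr1.tofile(fil1) always writes raw bytes, so in text mode you also need to write str values (for example str(val)) instead of calling tofile.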

You can try appending the results to a string and then saving it to a file, as follows:
from array import array as pyarray
tmp1 = (x for x in range(10))
tmp2 = (x+10 for x in range(10))
arr1 = pyarray('l')
fileoutput_str = ''
for i in range(10):
    val = next(tmp1) - next(tmp2)
    arr1.append(val)
    # one value per line
    fileoutput_str += str(val)+'\n'
fileoutput_fn = 'fileoutput'
# text mode ('w'), not binary ('wb')
with open(fileoutput_fn, 'w') as fileoutput_fo:
    fileoutput_fo.write(fileoutput_str)
You have to drop the binary flag b from the mode in order to write strings to the file.
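For millions of values, building one large string may use a lot of memory. A minimal sketch of a streaming alternative is below, assuming one value per line is an acceptable format. The helper name write_array_as_text and the temporary demo path are made up for illustration and are not from the question.

```python
import os
import tempfile
from array import array


def write_array_as_text(values, path):
    """Write each value on its own line, as text."""
    with open(path, 'w') as f:
        # writelines accepts any iterable of strings, so the generator
        # streams the values without building one big string first.
        f.writelines(f"{v}\n" for v in values)


# Same values as in the question: x - (x + 10) for x in 0..9
arr1 = array('l', (x - (x + 10) for x in range(10)))

# Hypothetical demo location so the example does not clobber real files
demo_path = os.path.join(tempfile.mkdtemp(), 'fileoutput')
write_array_as_text(arr1, demo_path)

with open(demo_path) as f:
    lines = f.read().splitlines()
print(lines)
```

The file object buffers the writes internally, so this avoids both the per-line overhead of an explicit loop with many small write calls and the memory cost of one giant string.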

Related

How to read through binary (.dat) file and return greatest int value python

I am having trouble working with the integers I loop through and print out in the binary file.
I have a main program that creates a binary file, writes x amount of random integers to the file, then closes the file.
*Throughout these code snippets, I import dump and load from pickle
from pickle import dump
from random import randint
output_file = open('file.dat', 'wb')
# 10 random integers
for i in range(10):
    dump(randint(1, 100), output_file)
output_file.close()
I have created a program that will open this file, unpickle each integer and print them out. However, now I also want to work with these numbers: max, min, sum, etc. When I try to produce code that (I thought) would do this, I am getting:
33 Traceback (most recent call last):
  File "binary_int_practice.py", line 13, in <module>
    for i in load(input_file):
TypeError: 'int' object is not iterable
My code is below:
input_file = open('file.dat', 'rb')
print("Here are the integers:")
while True:
    try:
        i = load(input_file)
        print(i, end=' ')
        big = 0
        for i in load(input_file):
            if i > big:
                big = i
        print('The max number in the file is: ', big)
    except EOFError:
        input_file.close()
        break
Can someone explain or help me understand where I am going wrong?
Thanks
load returns the next value read from the file; in your case, each value read is an int (just as you wrote them). It does not return an iterable that you can loop over.
So you'll have to get each number with its own call to load.
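A minimal sketch of that idea: call load once per stored value until it raises EOFError, collect the values in a list, and only then compute max, min and sum. The helper name load_all, the sample values and the temporary path are assumptions for the demo.

```python
import os
import tempfile
from pickle import dump, load


def load_all(path):
    """Return every pickled object stored back to back in the file."""
    values = []
    with open(path, 'rb') as f:
        while True:
            try:
                values.append(load(f))  # one call per stored object
            except EOFError:
                break  # no more pickles in the file
    return values


# Demo data: three ints dumped one after another, as in the question
demo_path = os.path.join(tempfile.mkdtemp(), 'file.dat')
with open(demo_path, 'wb') as f:
    for n in (33, 7, 92):
        dump(n, f)

numbers = load_all(demo_path)
print("Here are the integers:", numbers)
print("max:", max(numbers), "min:", min(numbers), "sum:", sum(numbers))
```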
You can use a list instead: fill it with the random numbers, then write the whole list to the file with a single dump call. Otherwise each iteration writes a separate pickled int to the file.
Here is code that works:
from pickle import dump
from random import randint
output_file = open('file.dat', 'wb')
# 10 random integers
data = []
for i in range(10):
    data.append(randint(1, 100))
# a single dump stores the whole list
dump(data, output_file)
output_file.close()
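With the list stored by one dump call, reading it back takes one load call, and the result is an ordinary list. A short sketch, assuming a temporary demo path instead of the real file.dat:

```python
import os
import tempfile
from pickle import dump, load
from random import randint

# Hypothetical demo location
path = os.path.join(tempfile.mkdtemp(), 'file.dat')

data = [randint(1, 100) for _ in range(10)]
with open(path, 'wb') as output_file:
    dump(data, output_file)  # one pickle holding the whole list

with open(path, 'rb') as input_file:
    loaded = load(input_file)  # one load returns the whole list

print("Here are the integers:", loaded)
print("The max number in the file is:", max(loaded))
```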

Read int values stored in a file in python

I am writing a program to encrypt a file using the RSA algorithm in Python without using the Crypto library. I have generated the keys, and e, n and d are stored in .pem files. In another script, where the encryption takes place, I use the e, d and n values, but every time I run the script I get this error:
  File "rsaencrypt.py", line 91, in <module>
    main()
  File "rsaencrypt.py", line 62, in main
    encrypt = pow(content, e, n)
TypeError: unsupported operand type(s) for pow(): 'bytes','_io.TextIOWrapper', '_io.TextIOWrapper'
Here's how I am opening the files in the encryption script and using pow() to encrypt:
n = open('nfile.pem', 'r')
e = open('efile.pem', 'r')
d = open('dfile.pem', 'r')
encrypt = pow(content, e, n)
I have searched the internet for how to read an int value from a file, but I have found nothing.
Here's how I am saving the values in efile, dfile and nfile:
#saving the values of n, d and e for further use
efile = open('efile.pem', 'w')
efile.write('%d' %(int(e)))
efile.close()
dfile = open('dfile.pem', 'w')
dfile.write('%d' %(int(d)))
dfile.close()
nfile = open('nfile.pem', 'w')
nfile.write('%d' % (int(n)))
nfile.close()
The values are stored like this: 564651648965132684135419864..............454
Now, since I want to encrypt the files, I need to read the integer values written in efile, dfile and nfile and pass them to pow() as arguments.
Looking forward to suggestions. Thank you.
The open() function returns a file object, not an int. You need to convert the file's contents to an int:
n = open('nfile.pem', 'r')
n_value = int(list(n)[0])
etc.
Another option (same result) is:
n = open('nfile.pem', 'r')
n_value = int(n.read())
The recommended way is to use with, which ensures your file is closed once you are done with it, rather than waiting for garbage collection or calling f.close() explicitly.
n_results = []
with open('nfile.pem', 'r') as f:
    for line in f:
        # convert each line to an int
        try:
            n_results.append(int(line))
        except ValueError:
            n_results.append(0) # you can replace 0 with any value to indicate a processing error
Also, use a try-except block in case your file contains noise that cannot be converted to integers. n_results is a list of all the values from the file, which you can aggregate or combine later into a single output.
This would be a better foundation as your project scales and if you deal with more data.
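Putting the pieces together, here is a minimal sketch of reading the three values and calling pow(). Two assumptions: each file holds a single integer, and the bytes in content must first be turned into an int (the traceback shows content is bytes, which pow() also rejects). The helper name read_int and the toy key (p=61, q=53) are illustrative only and far too small for real use.

```python
import os
import tempfile


def read_int(path):
    """Read a file containing a single decimal integer."""
    with open(path, 'r') as f:
        return int(f.read().strip())


# Toy key material written the same way as in the question
demo_dir = tempfile.mkdtemp()
for name, value in (('efile.pem', 17), ('dfile.pem', 2753), ('nfile.pem', 3233)):
    with open(os.path.join(demo_dir, name), 'w') as f:
        f.write('%d' % value)

e = read_int(os.path.join(demo_dir, 'efile.pem'))
d = read_int(os.path.join(demo_dir, 'dfile.pem'))
n = read_int(os.path.join(demo_dir, 'nfile.pem'))

content = b'A'                             # stand-in for the bytes to encrypt
message = int.from_bytes(content, 'big')   # bytes -> int, must be smaller than n
encrypted = pow(message, e, n)
decrypted = pow(encrypted, d, n)
print(message, encrypted, decrypted)
```

With a real key, a file longer than the modulus would have to be split into blocks smaller than n before calling pow() on each block.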

Writing Python arrays into txt file, one array per line

I would like to write a number of Python arrays into a txt file, with one array per line. Afterwards I would like to read the arrays back line by line.
My work-in-progress code is below. The problem I am working on involves about 100,000 arrays (the length of L).
from __future__ import division
from array import array
M = array('I',[1,2,3])
N = array('I',[10,20,30])
L = [M,N]
with open('manyArrays.txt','w') as file:
    for a in L:
        sA = a.tostring()
        file.write(sA + '\n')
with open('manyArrays.txt','r') as file:
    for line in file:
        lineRead = array('I', [])
        lineRead.fromstring(line)
        print lineRead
The error message I get is
lineRead.fromstring(line)
ValueError: string length not a multiple of item size
You can either use a numpy function for this, or write the lines yourself.
You could stack your arrays into one 2D array, save it directly with np.savetxt and load it with np.genfromtxt:
import numpy as np
M = np.array([1,2,3],dtype='I')
N = np.array([10,20,30],dtype='I')
data= np.array([M,N])
file='test.txt'
np.savetxt(file,data)
M2,N2 = np.genfromtxt(file)
Or do:
file2='test2.txt'
form="%i %i %i \n"
with open(file2,'w') as f:
    for i in range(len(data)):
        f.write(form % (data[i,0],data[i,1],data[i,2]))
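If you would rather stay with the standard-library array module, a text round trip can be sketched as below. It assumes the values are plain integers and that space-separated text is an acceptable line format. The helper names write_arrays and read_arrays and the temporary path are made up for illustration. The original error comes from writing raw bytes followed by '\n' and then reading them back in text mode, which this avoids by never storing raw bytes.

```python
import os
import tempfile
from array import array


def write_arrays(arrays, path):
    """Write one array per line as space-separated integers."""
    with open(path, 'w') as f:
        for a in arrays:
            f.write(' '.join(map(str, a)) + '\n')


def read_arrays(path, typecode='I'):
    """Read the file back into a list of arrays, one per line."""
    with open(path, 'r') as f:
        return [array(typecode, map(int, line.split())) for line in f]


M = array('I', [1, 2, 3])
N = array('I', [10, 20, 30])

# Hypothetical demo location
demo_path = os.path.join(tempfile.mkdtemp(), 'manyArrays.txt')
write_arrays([M, N], demo_path)
restored = read_arrays(demo_path)
print(restored)
```

Unlike the np.savetxt route, the arrays do not all need the same length here, because each line is parsed independently.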

Python - Efficient way to flip bytes in a file?

I've got a folder full of very large files that need to be byte flipped by a power of 4. So essentially, I need to read the files as a binary, adjust the sequence of bits, and then write a new binary file with the bits adjusted.
In essence, what I'm trying to do is read a hex string hexString that looks like this:
"00112233AABBCCDD"
And write a file that looks like this:
"33221100DDCCBBAA"
(i.e. every two characters is a byte, and I need to flip the bytes by a power of 4)
I am very new to python and coding in general, and the way I am currently accomplishing this task is extremely inefficient. My code currently looks like this:
import binascii
with open(myFile, 'rb') as f:
    content = f.read()
hexString = str(binascii.hexlify(content))
flippedBytes = ""
inc = 0
while inc < len(hexString):
    flippedBytes += hexString[inc + 6:inc + 8]
    flippedBytes += hexString[inc + 4:inc + 6]
    flippedBytes += hexString[inc + 2:inc + 4]
    flippedBytes += hexString[inc:inc + 2]
    inc += 8
..... write the flippedBytes to file, etc
The code I pasted above accurately accomplishes what I need (note, my actual code has a few extra lines of: "hexString.replace()" to remove unnecessary hex characters - but I've left those out to make the above easier to read). My ultimate problem is that it takes EXTREMELY long to run my code with larger files. Some of my files I need to flip are almost 2gb in size, and the code was going to take almost half a day to complete one single file. I've got dozens of files I need to run this on, so that timeframe simply isn't practical.
Is there a more efficient way to flip the HEX values in a file by a power of 4?
.... for what it's worth, there is a tool called WinHEX that can do this manually, and only takes a minute max to flip the whole file.... I was just hoping to automate this with python so we didn't have to manually use WinHEX each time
You want to convert your 4-byte integers from little-endian to big-endian, or vice-versa. You can use the struct module for that:
import struct
with open(myfile, 'rb') as infile, open(myoutput, 'wb') as of:
    while True:
        d = infile.read(4)
        if not d:
            break
        le = struct.unpack('<I', d)
        be = struct.pack('>I', *le)
        of.write(be)
Here is a little struct awesomeness to get you started:
>>> import struct
>>> s = b'\x00\x11\x22\x33\xAA\xBB\xCC\xDD'
>>> a, b = struct.unpack('<II', s)
>>> s = struct.pack('>II', a, b)
>>> ''.join([format(x, '02x') for x in s])
'33221100ddccbbaa'
To do this at full speed for a large input, use struct.iter_unpack
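Reading 4 bytes at a time is slow on multi-gigabyte files. One possible sketch is to process large chunks and swap all words of a chunk in one unpack/pack pair. This uses struct.unpack and struct.pack with a repeat count rather than struct.iter_unpack, and it assumes the file size is a multiple of 4. The function names swap32 and swap32_file and the 1 MiB chunk size are assumptions, not tuned values.

```python
import struct


def swap32(data):
    """Reverse the byte order of every 4-byte word in data."""
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4")
    count = len(data) // 4
    # Read the words as little-endian, write them back as big-endian
    words = struct.unpack('<%dI' % count, data)
    return struct.pack('>%dI' % count, *words)


def swap32_file(src, dst, chunk_size=1 << 20):
    """Byte-swap a whole file chunk by chunk (chunk_size is a multiple of 4)."""
    with open(src, 'rb') as infile, open(dst, 'wb') as outfile:
        while True:
            chunk = infile.read(chunk_size)
            if not chunk:
                break
            outfile.write(swap32(chunk))


flipped = swap32(bytes.fromhex('00112233AABBCCDD'))
print(flipped.hex())
```

Working on bytes directly also avoids the hex-string conversion and the repeated string concatenation in the original loop.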

Writing to a binary file python

I want to write something to a binary file using python.
I am simply doing:
import numpy as np
f = open('binary.file','wb')
i=4
j=5.55
f.write('i'+'j') #where do i specify that i is an integer and j is a double?
g = open('binary.file','rb')
first = np.fromfile(g,dtype=np.uint32,count = 1)
second = np.fromfile(g,dtype=np.float64,count = 1)
print first, second
The output is just:
[] []
I know it is very easy to do this in Matlab "fwrite(binary.file, i, 'int32');", but I want to do it in python.
You appear to be having some confusion about types in Python.
The expression 'i' + 'j' is adding two strings together. This results in the string ij, which is most likely written to the file as two bytes.
The variable i is already an int. You can write it to a file as a 4-byte integer in a couple of different ways (which also apply to the float j):
Use the struct module as detailed in how to write integer number in particular no of bytes in python ( file writing). Something like this:
import struct
with open('binary.file', 'wb') as f:
    f.write(struct.pack("i", i))
You would use the 'd' specifier to write j.
Use the numpy module to do the writing for you, which is especially convenient since you are already using it to read the file. The method ndarray.tofile is made just for this purpose:
i = 4
j = 5.55
with open('binary.file', 'wb') as f:
    np.array(i, dtype=np.uint32).tofile(f)
    np.array(j, dtype=np.float64).tofile(f)
Note that in both cases I use open as a context manager when writing the file with a with block. This ensures that the file is closed, even if an error occurs during writing.
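For completeness, here is a small round-trip sketch with struct alone, writing both values and reading them back. The '<' prefix (little-endian, standard sizes, no padding) and the temporary file path are choices made for this example, not part of the question.

```python
import os
import struct
import tempfile

i = 4
j = 5.55

# Hypothetical demo location
path = os.path.join(tempfile.mkdtemp(), 'binary.file')

with open(path, 'wb') as f:
    f.write(struct.pack('<i', i))  # 4-byte signed integer
    f.write(struct.pack('<d', j))  # 8-byte double

with open(path, 'rb') as g:
    (first,) = struct.unpack('<i', g.read(4))
    (second,) = struct.unpack('<d', g.read(8))

print(first, second)
```

The file is closed by the first with block before it is reopened for reading, so all 12 bytes are on disk by the time they are read back.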
That's because you are trying to write a string into a binary file. You also don't close the file before trying to read it again.
If you want to write ints or strings to a binary file, try the code below:
import numpy as np
import struct
f = open('binary.file','wb')
i = 4
if isinstance(i, int):
    f.write(struct.pack('i', i)) # write an int
elif isinstance(i, str):
    f.write(i) # write a string
else:
    raise TypeError('Can only write str or int')
f.close()
g = open('binary.file','rb')
first = np.fromfile(g,dtype=np.uint32,count = 1)
second = np.fromfile(g,dtype=np.float64,count = 1)
print first, second
I'll leave it to you to figure out the floating number.
print first, second
[4] []
A more Pythonic way to handle the files:
import numpy as np
import struct
with open ('binary.file','wb') as f:
    i = 4
    if isinstance(i, int):
        f.write(struct.pack('i', i)) # write an int
    elif isinstance(i, str):
        f.write(i) # write a string
    else:
        raise TypeError('Can only write str or int')
with open('binary.file','rb') as g:
    first = np.fromfile(g,dtype=np.uint32,count = 1)
    second = np.fromfile(g,dtype=np.float64,count = 1)
print first, second
