How to use binary files. Overwriting specific bytes - python

I'm writing a program in python, and would like to be able to write to specific bytes in a binary file. I tried to do this in the shell with a small binary file containing the numbers 0 through 15, but I can't figure out how to do so. Below is the code I just entered into the shell with comments to demonstrate what I am trying to do:
>>> File=open("TEST","wb") # Opens the file for writing.
>>> File.write(bytes(range(16))) # Writes the numbers 0 through 15 to the file.
16
>>> File.close() # Closes the file.
>>> File=open("TEST","rb") # Opens the file for reading, so that we can test that its contents are correct.
>>> tuple(File.read()) # Expected output: (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
>>> File.close() # Closes the file.
>>> File=open("TEST","wb") # Opens the file for writing.
>>> File.seek(3) # Moves the pointer to position 3. (Fourth byte.)
3
>>> File.write(b"\x01") # Writes 1 to the file in its current position.
1
>>> File.close() # Closes the file.
>>> File=open("TEST","rb") # Opens the file for reading, so that we can test that its contents are correct.
>>> tuple(File.read()) # Expected output: (0, 1, 2, 1, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
(0, 0, 0, 1)
>>> File.close()
>>> File=open("TEST","wb") # I will try again using append mode to overwrite.
>>> File.write(bytes(range(16)))
16
>>> File.close()
>>> File=open("TEST","ab") # Append mode.
>>> File.seek(3)
3
>>> File.write(b"\x01")
1
>>> File.close()
>>> File=open("TEST","rb")
>>> tuple(File.read()) # Expected output: (0, 1, 2, 1, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 1)
>>> File.close()
My desired output is as shown, but "wb" seems to erase all the data in the file, while "ab" can't seek backwards.
How would I achieve my desired output without rewriting the whole file?

When you open a file for writing with w, the file is truncated: all existing contents are removed. To keep them, open the file for reading and writing with r+ instead. From the open() function documentation:
'w' open for writing, truncating the file first
and
For binary read-write access, the mode 'w+b' opens and truncates the file to 0 bytes. 'r+b' opens the file without truncation.
Because the file was truncated first, seeking to position 3 then writing \x01 has the first few bytes filled in with \x00 for you.
Opening a file in append mode usually restricts access to the new portion of the file only, so anything past the first 16 bytes. Again, from the documentation:
Other common values are [...] and 'a' for appending (which on some Unix systems, means that all writes append to the end of the file regardless of the current seek position).
This is why your \x01 byte ends up right at the end in spite of the File.seek(3) call.
r does not truncate a file and gives you full range of the contents with seek(); r+ adds write access to that mode. Demo with 'r+b':
>>> with open('demo.bin', 'wb') as f:
...     f.write(bytes(range(16)))
...
16
>>> with open('demo.bin', 'rb') as f:
...     print(*f.read())
...
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
>>> with open('demo.bin', 'r+b') as f: # read and write mode
...     f.seek(3)
...     f.write(b'\x01')
...
3
1
>>> with open('demo.bin', 'rb') as f:
...     print(*f.read())
...
0 1 2 1 4 5 6 7 8 9 10 11 12 13 14 15
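The same idea can be wrapped in a small helper. This is only a sketch: overwrite_bytes is a hypothetical name, not part of the standard library, and the temporary file exists just to make the example self-contained.

```python
import os
import tempfile

def overwrite_bytes(path, offset, data):
    """Overwrite len(data) bytes starting at offset, leaving the rest of the file intact."""
    # 'r+b' opens an existing file for reading and writing without truncating it
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)

# Build a throwaway 16-byte file containing 0..15
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "wb") as f:
    f.write(bytes(range(16)))

# Replace only the fourth byte
overwrite_bytes(path, 3, b"\x01")

with open(path, "rb") as f:
    result = tuple(f.read())
os.remove(path)

print(result)
```

Note that 'r+b' requires the file to exist already; it raises FileNotFoundError otherwise.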

The solution is another mode: "r+b". (as shown by other answers.)
Here is the solution in the shell from where the file left off:
>>> File=open("TEST","r+b") # Opens file for reading and writing.
>>> File.seek(3)
3
>>> File.write(b"\x01")
1
>>> File.close()
>>> File=open("TEST","rb")
>>> tuple(File.read()) # Expected output: (0, 1, 2, 1, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 1)
(0, 1, 2, 1, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 1)
>>> File.close()

If I remember properly, you have to open the file in "Append mode" or it will simply erase everything and start from scratch, and then when you use seek(3) you just create those 3 0's and then you write the 1. I'll investigate further on how to write directly to a position but you may have to read the whole file, modify, write the whole file again.
You can actually read about this behaviour in the documentation:
'w' for only writing (an existing file with the same name will be erased)

Related

how to generate many lists and assign values to them

I want to read a specific number of lines from a list and assign all those values to a new list. Then I want to read the next bunch from the last_value+1 line from before for the exact same number of lines and assign those to a new list. So far I have this:
Let's say u = [1,2,3....,9,10,11,12,13...,19,20] and I want to assign the first 10 values from u into my newly generated list1 = [] => list1 = [1,2,..9,10]
then I want the next 10 values from u to be assigned to list2 so list2 = [11,12,13..,20]. The code so far is:
nLines = 10
nrepeats = 2
j=0
i=0
while (j<nrepeats):
    ### Generating empty lists ###
    mklist = "list" + str(j) + " = []"
    ### do the segmentation ###
    for i, uline in enumerate(u):
        if i >= i and i < i+nLines:
            mklist.append(uline)
    j=j+1
Now the problem is that I can't append to mklist because it's a string:
AttributeError: 'str' object has no attribute 'append'
How can I assign those values within that loop?
You could use a more suitable collection, for example, a dictionary:
nLines = 10
nrepeats = 2
my_dict = {}
j = 0
while j < nrepeats:
    # Create an empty list for chunk j
    my_dict[str(j)] = []
    # Copy only the items whose index falls inside chunk j
    start = j * nLines
    for i, uline in enumerate(u):
        if start <= i < start + nLines:
            my_dict[str(j)].append(uline)
    j = j + 1
You can use the zip function to group elements from iterables into groups of the same size. There are actually two ways, which differ in how you want to handle cases where the source data can't be divided cleanly:
u = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21]
The first way is with regular zip and discards the leftover fragment
>>> list(zip(*[iter(u)]*10))
[(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), (11, 12, 13, 14, 15, 16, 17, 18, 19, 20)]
The second way uses itertools.zip_longest and pads out the last group with some fillvalue (default None)
>>> import itertools
>>> list(itertools.zip_longest(*[iter(u)]*10, fillvalue=None))
[(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), (11, 12, 13, 14, 15, 16, 17, 18, 19, 20), (21, None, None, None, None, None, None, None, None, None)]
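If neither discarding nor padding the leftover items is wanted, plain slicing is a third option that keeps the remainder as a shorter final list. A minimal sketch, where chunks is a hypothetical helper name and the input is assumed to be a sequence that supports slicing:

```python
def chunks(seq, size):
    """Split seq into consecutive slices of at most size items."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]

u = list(range(1, 22))   # 1..21, same data as above
parts = chunks(u, 10)

for part in parts:
    print(part)
```

Here parts[0] and parts[1] play the role of list1 and list2 from the question, and the last entry holds whatever is left over.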

Unable to understand why this Python FOR loop isn't working as intended

I'm trying to generate a random dataset to plot a graph in Python 2.7.
In which the 'y' list stores 14 integers between 100 and 135. I did that using the following code:
y = [random.randint(100, 135) for i in xrange(14)]
And for the 'x' list, I wanted to store the index values of the elements in 'y'. To achieve this, I tried using the code:
x = []
for i in y:
    pt = y.index(i)
    x.append(pt)
But when I run this, the result of the for loop ends up being:
x = [0, 1, 1, 3, 4, 5, 6, 3, 1, 9, 1, 11, 12, 13]
Why isn't the result the following?
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
Python is returning the first location of a value in y.
Running your code, here's an example y:
[127, 124, 105, 119, 121, 118, 130, 123, 122, 105, 110, 109, 108, 110]
110 is at both 10 and 13. 105 is at both 2 and 9. Python stops looking after it finds the first one, so x then becomes:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 2, 10, 11, 12, 10]
list.index() finds the index of the first occurrence of that object in the list, which produces the results you saw when the list has repeated elements. Something like [1, 1, 1].index(1) would produce 0.
If you want to generate a list of indices, you can use list(range(len(y))), which finds the length of y, creates a range() object out of it, then creates a list out of that object. Since you're using Python 2, you can omit the list() call, since Python 2's range() will already return a list.
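A short sketch of both behaviours, written for Python 3 (so range() is wrapped in list(), and xrange is not used). The variable names x_enum, dup and first_hits are only for illustration:

```python
import random

y = [random.randint(100, 135) for _ in range(14)]

# Positions 0..len(y)-1, independent of the values stored in y
x = list(range(len(y)))

# The same positions, obtained while iterating over y
x_enum = [i for i, _ in enumerate(y)]

# list.index() returns the first match, so a repeated value repeats an earlier index
dup = [105, 110, 105]
first_hits = [dup.index(v) for v in dup]
print(first_hits)  # [0, 1, 0]
```

enumerate() is the usual choice when both the position and the value are needed inside the loop.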

How to format data into a python list

I am trying to figure out how to take data which looks like:
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
16 17 18 19 20
from a file and make it look like:
[1, 2, 3, 4, 5,
6, 7, 8, 9, 10,
11, 12, 13, 14, 15,
16, 17, 18, 19, 20]
Each line represents a row from the data file, so the starting format should not be confused with:
1 2 3 4 5 6 7...
The original data is contained in a file so it must be first read in and then rewritten with the commas and brackets into a new file. My starting point would be to first read in the data:
with open("data.txt","r") as data:
    lines = data.readlines()
Then I know that I have to take the read lines and rewrite them in the format I need but I don't know how to do this for each element of each line.
You can tell Python to split each line by its spaces, and then join the resulting elements with ', ' like this:
', '.join(line_in.split())
This will convert a string like:
'6 7 8 9 10'
into this:
'6, 7, 8, 9, 10'
Now, you need to decide whether this is the last line of the file or not. If it is the last line of the file you need to append a "]" whereas if it is not the last line, you need to add a ",".
At the beginning of the file you need also to add a "["
Hope it helps
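Putting those steps together, a minimal sketch might look like this. format_rows is a hypothetical helper name, and it takes a list of lines (as returned by readlines()) so that the file handling stays separate; blank lines are assumed to be skippable.

```python
def format_rows(lines):
    """Join each row's items with ', ', rows with ',' plus a newline, and wrap in brackets."""
    rows = [", ".join(line.split()) for line in lines if line.strip()]
    return "[" + ",\n".join(rows) + "]"

lines = ["1 2 3 4 5\n", "6 7 8 9 10\n", "11 12 13 14 15\n", "16 17 18 19 20\n"]
formatted = format_rows(lines)
print(formatted)
```

The result could then be written out with something like open("out.txt", "w").write(formatted), where out.txt is a placeholder file name.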
You can try something like this:
>>> data = open('data.txt').read().split()
>>> data = [int(item) for item in data if item.isdigit()]
>>> data
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
This will get all of the integers in the file. If you don't need the items converted to int, just drop the second line and keep data = open('data.txt').read().split().

Python CSV transpose data in one column to rows

I have a CSV file containing data only in the first column,
I want to use python to transpose every 4 rows to another empty CSV file, for example, row 1 to row 4 transposed to the first row; then row 5 to row 8 transposed to the second row,...etc, and finally we can get a 5 * 4 matrix in the CSV file.
How to write a script to do this? Please give me any hint and suggestion, thank you.
I am using python 2.7.4 under Windows 8.1 x64.
update#1
I use the following code provided by thefortheye,
import sys, os
os.chdir('C:\Users\Heinz\Desktop')
print os.getcwd()
from itertools import islice
with open("test_csv.csv") as in_f, open("Output.csv", "w") as out_file:
    for line in ([i.rstrip()] + map(str.rstrip, islice(in_f, 3)) for i in in_f):
        out_file.write("\t".join(line) + "\n")
the input CSV file is,
and the result is,
This is not what I want.
You can use List comprehension like this
data = range(20)
print data
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
print [data[i:i + 4] for i in xrange(0, len(data), 4)]
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18, 19]]
Instead of 4, you might want to use 56.
Since you are planning to read from the file, you might want to do something like this
from itertools import islice
with open("Input.txt") as in_file:
    print [[int(line)] + map(int, islice(in_file, 3)) for line in in_file]
Edit As per the updated question,
from itertools import islice
with open("Input.txt") as in_f, open("Output.txt", "w") as out_file:
    for line in ([i.rstrip()] + map(str.rstrip, islice(in_f, 3)) for i in in_f):
        out_file.write("\t".join(line) + "\n")
Edit: Since you are looking for comma separated values, you can join the fields of each line with a comma, like this
out_file.write(",".join(line) + "\n")
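The snippets above are Python 2 (print statement, map() returning a list). For comparison, here is a rough Python 3 sketch of the same regrouping that uses the csv module. regroup_column is a hypothetical helper name, and io.StringIO objects stand in for the real input and output files so the example is self-contained.

```python
import csv
import io
from itertools import islice

def regroup_column(in_file, out_file, width=4):
    """Read a one-column CSV and write its values out again, width values per row."""
    reader = csv.reader(in_file)
    writer = csv.writer(out_file, lineterminator="\n")
    # Take the first cell of each non-empty row
    values = (row[0] for row in reader if row)
    while True:
        chunk = list(islice(values, width))
        if not chunk:
            break
        writer.writerow(chunk)

source = io.StringIO("1\n2\n3\n4\n5\n6\n7\n8\n")
target = io.StringIO()
regroup_column(source, target)
print(target.getvalue())
```

With real files, the csv documentation recommends opening them with newline="" so that the module controls the line endings itself.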
You can use List comprehension and double-loop like this.
>>> M = 3
>>> N = 5
>>> a = range(M * N)
>>> o = [[a[i * N + j] for j in xrange(N)] for i in xrange(M)]
>>> print o
[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]]

Sorting bytes in words, tuples in python

I looked around and I can't seem to find the proper way of sorting a 32 entry tuple by inverting every odd and even entry.
ex:
1 0 3 2 5 4 7 6 9 8
to
0 1 2 3 4 5 6 7 8 9
My current code looks like this
i=0
nd = []
while i < len(self.r.ipDeviceName):
    print(i)
    if i%2:
        nd[i]=self.r.ipDeviceName[i-1]
    else:
        nd[i]=self.r.ipDeviceName[i+1]
dn = "".join(map(chr,nd))
devicenameText.SetValue(dn)
The type of self.r.ipDeviceName is tuple, and I get either an IndexError or an error saying that a tuple doesn't support item assignment, depending on variations of the code.
I also tried this with the same results
nd = self.r.ipDeviceName
for i in nd:
    if i&0x01:
        nd[i]=self.r.ipDeviceName[i-1]
    else:
        nd[i]=self.r.ipDeviceName[i+1]
dn = "".join(map(chr,nd))
devicenameText.SetValue(dn)
With the same results. Something very simple seems to elude me. Thanks for your help and time.
Tuples are immutable - you can't modify them once they are created. To modify individual elements you want to store the data in a mutable collection such as a list instead. You can use the built-in functions list and tuple to convert from tuple to list or vice versa.
Alternatively you could use zip and a functional style approach to create a new tuple from your existing tuple without modifying the original:
>>> t = tuple(range(10))
>>> tuple(x for i in zip(t[1::2], t[::2]) for x in i)
(1, 0, 3, 2, 5, 4, 7, 6, 9, 8)
Or using itertools.chain:
>>> import itertools
>>> tuple(itertools.chain(*zip(t[1::2], t[::2])))
(1, 0, 3, 2, 5, 4, 7, 6, 9, 8)
Note that the use of zip here assumes that your tuple has an even number of elements (which is the case here, according to your question).
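For the list-based route mentioned at the start of this answer, a minimal Python 3 sketch could look like the following. swap_pairs is a hypothetical name, and leaving a trailing unpaired item in place is an assumption, since the question only covers even lengths.

```python
def swap_pairs(t):
    """Return a new tuple with each adjacent pair of items swapped."""
    items = list(t)                          # tuples are immutable, so work on a list copy
    even_len = len(items) - len(items) % 2   # ignore a trailing unpaired item
    # Swap the even-index and odd-index slices in one assignment
    items[0:even_len:2], items[1:even_len:2] = items[1:even_len:2], items[0:even_len:2]
    return tuple(items)

swapped = swap_pairs((1, 0, 3, 2, 5, 4, 7, 6, 9, 8))
print(swapped)
```

The right-hand side is evaluated before either slice is assigned, so the two slices can be exchanged without a temporary variable.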
You can't change a tuple, they're immutable. However you can replace them with a new one arranged the way you want (I wouldn't call what you want "sorted"). To do it, all that is needed is to swap each pair of items that are in the original tuple.
Here's a straight-forward implementation. Note it leaves the last entry alone if there are an odd number of them since you never said how you wanted that case handled. Dealing with that possibility complicates the code slightly.
def swap_even_odd_entries(seq):
    tmp = list(seq)+[seq[-1]] # convert sequence to mutable list and dup last
    for i in xrange(0, len(seq), 2):
        tmp[i],tmp[i+1] = tmp[i+1],tmp[i] # swap each entry with following one
    return tuple(tmp[:len(seq)]) # remove any excess
a = (1, 0, 3, 2, 5, 4, 7, 6, 9, 8)
a = swap_even_odd_entries(a)
b = (91, 70, 23, 42, 75, 14, 87, 36, 19, 80)
b = swap_even_odd_entries(b)
c = (1, 0, 3, 2, 5)
c = swap_even_odd_entries(c)
print a
print b
print c
# output
# (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
# (70, 91, 42, 23, 14, 75, 36, 87, 80, 19)
# (0, 1, 2, 3, 5)
The same thing can also be done in a less-readable way as a long single expression. Again the last entry remains unchanged if the length is odd.
swap_even_odd_entries2 = lambda t: tuple(
    v for p in [(b,a) for a,b in zip(*[iter(t)]*2) + [(t[-1],)*2]]
        for v in p)[:len(t)]
a = (1, 0, 3, 2, 5, 4, 7, 6, 9, 8)
a = swap_even_odd_entries2(a)
b = (91, 70, 23, 42, 75, 14, 87, 36, 19, 80)
b = swap_even_odd_entries2(b)
c = (1, 0, 3, 2, 5)
c = swap_even_odd_entries2(c)
print
print a
print b
print c
# output
# (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
# (70, 91, 42, 23, 14, 75, 36, 87, 80, 19)
# (0, 1, 2, 3, 5)
If you add the functions grouper and flatten (see itertools recipes) to your toolset, you can do:
xs = [1, 0, 3, 2, 5, 4, 7, 6, 9, 8]
xs2 = flatten((y, x) for (x, y) in grouper(2, xs))
# list(xs2) => [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
You could even write flatten(imap(reversed, grouper(2, xs))) but I guess only die-hard functional guys would like it.
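Neither helper is importable from itertools itself, so they have to be defined locally. The sketch below is a Python 3 approximation of the recipes; the (n, iterable) argument order of grouper is an assumption chosen to match the call used in this answer, and it may differ from the recipe in the documentation for your Python version.

```python
from itertools import chain, zip_longest

def grouper(n, iterable, fillvalue=None):
    """Collect items into tuples of length n, padding the last one with fillvalue."""
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

def flatten(list_of_iterables):
    """Chain the inner iterables into one flat iterator."""
    return chain.from_iterable(list_of_iterables)

xs = [1, 0, 3, 2, 5, 4, 7, 6, 9, 8]
xs2 = list(flatten((y, x) for (x, y) in grouper(2, xs)))
print(xs2)
```

With an odd number of items the last pair is padded, so a fillvalue (None by default) would show up in the result and would need to be filtered out.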
