How to properly pad numbers in Python's struct - python

I have a list of tuples like this (note that the first element can be fairly large, though not extremely so, e.g. 2**12 - 1 is OK, and the second is always in the range [0, 255]).
t = [(0, 137), (0, 80), (0, 78), (0, 71), (0, 13), ...]
I want to store these numbers as bytes on the file system (for compression), which means I also want to read those bytes back later to recover the tuples. Big-endian byte order is a requirement.
for idx, v in compressed:
    if v:
        f.write(struct.pack(">I", idx))
        f.write(struct.pack(">I", v))
However, when I try to get the numbers, like this:
with open(filepath, 'rb') as file:
    data = file.read(4)
    nums = []
    while data:
        num = struct.unpack(">I", data)[0]
        print(num)
        data = file.read(4)
        nums.append(num)
I am not getting the numbers above (some numbers come back correctly, but later the sequence gets messed up, probably because of bit padding).
How do I stay consistent with padding? How can I write something with struct.pack('>I', ...) that I can reliably read back later?
Update:
For the following tuple
[(0, 137), (0, 80), (0, 78), (0, 71), (0, 13), (0, 10), (0, 26), (6, 0), (0, 0), (9, 13), (0, 73), (0, 72), (0, 68), (0, 82), (9, 0), (0, 1), (0, 44), (15, 1), (17, 8), (0, 2), (15, 0), (0, 246) ...]
I get the following numbers using my approach:
[0, 137, 0, 80, 0, 78, 0, 71, 0, 13, 0, 10, 0, 26, 9, 13, 0, 73, 0, 72, 0, 68, 0, 82, 0, 1, 0, 44, 15, 1, 17, 8, 0, 2, 0, 246 ...]
See, at (6, 0) it starts to diverge. Until then it's fine. But it seems to correct itself at (9, 13) and continues correctly from there.

Your code seems to work fine. However, you do have the following line in there:
if v:
v is falsy for the tuples whose second element is 0, so those pairs are never written to the file, and therefore you won't see them when reading the file back.
Also, since you're writing your elements in pairs anyway, you could use >II as your format:
from struct import pack, unpack, calcsize
original = [(0, 137), (0, 80), (0, 78), (0, 71), (0, 13), (0, 10), (0, 26), (6, 0), (0, 0), (9, 13), (0, 73), (0, 72), (0, 68), (0, 82), (9, 0), (0, 1), (0, 44), (15, 1), (17, 8), (0, 2), (15, 0), (0, 246)]
filename = "test.txt"
fileformat = ">II"
# Write every pair as one fixed-size 8-byte record
with open(filename, "wb") as fp:
    for element in original:
        fp.write(pack(fileformat, *element))
# Read the file back one record at a time until read() returns b""
with open(filename, "rb") as fp:
    elements = iter(lambda: fp.read(calcsize(fileformat)), b"")
    readback = [unpack(fileformat, element) for element in elements]
print(readback == original)
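On the padding question itself: with the ">" prefix, struct uses standard sizes and adds no alignment padding, so every record has a fixed, predictable length. The following is a minimal sketch, assuming the first element fits in 32 bits and the second in one byte as the question states. The names RECORD, encode and decode are illustrative and not part of the original code.

```python
import struct

# Big-endian record: 4-byte unsigned index followed by a 1-byte value.
# With ">" there is no alignment padding, so each record is exactly 5 bytes.
RECORD = struct.Struct(">IB")


def encode(pairs):
    """Pack every pair, including those whose value is 0."""
    return b"".join(RECORD.pack(idx, v) for idx, v in pairs)


def decode(blob):
    """Unpack a bytes object made of back-to-back records."""
    return list(RECORD.iter_unpack(blob))


pairs = [(0, 137), (6, 0), (0, 0), (9, 13), (4095, 255)]
blob = encode(pairs)
decoded = decode(blob)
print(RECORD.size)       # 5
print(decoded == pairs)  # True
```

Because no pair is skipped and every record has the same size, the reader never loses track of where one record ends and the next begins.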

Given the following input:
compressed = [(0, 137), (0, 80), (0, 78), (0, 71), (0, 13), (0, 10), (0, 26), (6, 0), (0, 0), (9, 13), (0, 73), (0, 72), (0, 68), (0, 82), (9, 0), (0, 1), (0, 44), (15, 1), (17, 8), (0, 2), (15, 0), (0, 246)]
Try this code for writing the data:
import struct
with open('file.dat', 'wb') as f:
    for idx, v in compressed:
        f.write(struct.pack(">I", idx))
        f.write(struct.pack(">I", v))
and this code for reading:
with open('file.dat', 'rb') as f:
    data = f.read(4)
    nums = []
    while data:
        idx = struct.unpack(">I", data)[0]
        data = f.read(4)
        v = struct.unpack(">I", data)[0]
        data = f.read(4)
        nums.append((idx,v))
and nums contains:
[(0, 137), (0, 80), (0, 78), (0, 71), (0, 13), (0, 10), (0, 26), (6, 0), (0, 0), (9, 13), (0, 73), (0, 72), (0, 68), (0, 82), (9, 0), (0, 1), (0, 44), (15, 1), (17, 8), (0, 2), (15, 0), (0, 246)]
This is the same as the input; in fact, nums == compressed gives True.

Related

Python: Create route from Arcs

I am working on a capacitated vehicle routing problem and have found an optimal solution with the following set of arcs on my graph active:
[(0, 1),
(0, 4),
(0, 5),
(0, 6),
(0, 7),
(0, 10),
(1, 0),
(2, 13),
(3, 9),
(4, 12),
(5, 0),
(6, 14),
(7, 8),
(8, 0),
(9, 0),
(10, 11),
(11, 0),
(12, 3),
(13, 0),
(14, 2)]
the list is called arcsDelivery.
I would like to restructure this list to come to my found routes stored in the list routesdelivery:
[[0,1,0],[0,4,12,3,9,0],[0,5,0],[0,6,14,2,13,0],[0,7,8,0],[0,10,11,0]]
However, I have been struggling to do so. Does anyone have some helpful tips?
Here is a way to do it (assuming the arcsDelivery list is sorted in ascending order by the first element of each tuple):
def findTuple(elem):
    # Return the first arc that starts at elem, or None if there is none
    for t in arcsDelivery:
        if t[0]==elem:
            return t
    return None
arcsDelivery = [(0, 1),
(0, 4),
(0, 5),
(0, 6),
(0, 7),
(0, 10),
(1, 0),
(2, 13),
(3, 9),
(4, 12),
(5, 0),
(6, 14),
(7, 8),
(8, 0),
(9, 0),
(10, 11),
(11, 0),
(12, 3),
(13, 0),
(14, 2)]
routesDelivery = []
# Number of routes = number of arcs leaving the depot (node 0)
startRoutes = len(list(filter(lambda elem: elem[0]==0, arcsDelivery)))
for i in range(startRoutes):
    tempList = []
    currentTuple = arcsDelivery[i]
    tempList.append(currentTuple[0])
    tempList.append(currentTuple[1])
    # Follow the arcs until the route returns to the depot
    while True:
        if currentTuple[1]==0:
            break
        else:
            nextTuple = findTuple(currentTuple[1])
            currentTuple = nextTuple
            tempList.append(currentTuple[1])
    routesDelivery.append(tempList)
print(routesDelivery)
Output:
[[0, 1, 0], [0, 4, 12, 3, 9, 0], [0, 5, 0], [0, 6, 14, 2, 13, 0], [0, 7, 8, 0], [0, 10, 11, 0]]
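The same idea can be sketched without relying on the list being sorted, by first building a successor lookup. This is only an illustration: it assumes every non-depot node has exactly one outgoing arc and that every route returns to the depot, and the names build_routes, depot and successor are hypothetical.

```python
def build_routes(arcs, depot=0):
    """Chain arcs into routes that start and end at the depot."""
    # Each non-depot node is assumed to have exactly one outgoing arc.
    successor = {a: b for a, b in arcs if a != depot}
    routes = []
    for a, b in arcs:
        if a != depot:
            continue  # only arcs leaving the depot start a route
        route = [depot, b]
        while route[-1] != depot:
            route.append(successor[route[-1]])
        routes.append(route)
    return routes


arcsDelivery = [(0, 1), (0, 4), (0, 5), (0, 6), (0, 7), (0, 10), (1, 0),
                (2, 13), (3, 9), (4, 12), (5, 0), (6, 14), (7, 8), (8, 0),
                (9, 0), (10, 11), (11, 0), (12, 3), (13, 0), (14, 2)]
routesDelivery = build_routes(arcsDelivery)
print(routesDelivery)
```

The dictionary lookup replaces the linear scan in findTuple, which matters only for large arc lists.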

Incorrect matplotlib plot

I've been trying to make a simple plot on matplotlib with the following set of datapoints, but I'm getting an incorrect plot which is utterly baffling. The plot includes points that aren't in the set of datapoints.
The set of points I'm plotting are:
[(0, 0), (3, 0), (0, 0), (2, 0), (0, 0), (3, 0), (1, 0), (7, 0), (2, 0), (0, 0), (5, 0), (2, 1), (10, 1), (1, 0), (1, 0), (8, 0), (3, 0), (1, 0), (2, 0), (2, 0), (1, 0), (6, 1), (3, 0), (3, 0), (12, 1), (3, 0), (0, 0), (2, 0), (0, 0), (2, 0), (3, 1), (0, 0), (4, 0), (4, 0), (2, 0), (2, 0)]
And I'm simply calling:
plt.plot(pts, 'ro')
I'd love to know how I'm going wrong here. Thanks in advance.
Currently, matplotlib thinks that you're trying to plot each entry of the tuple against the index of the tuple. That is, your plot has the points (i, x_i) and (i, y_i), with i going from 0 to 35.
As @jedwards pointed out, you could use the scatter function.
Or, you could make the plot function explicitly plot (x_i, y_i) by extracting each element of the tuple as follows:
import matplotlib.pyplot as plt
data = [(0, 0), (3, 0), (0, 0), (2, 0), (0, 0), (3, 0), (1, 0), (7, 0), (2, 0), (0, 0), (5, 0), (2, 1), (10, 1), (1, 0), (1, 0), (8, 0), (3, 0), (1, 0), (2, 0), (2, 0), (1, 0), (6, 1), (3, 0), (3, 0), (12, 1), (3, 0), (0, 0), (2, 0), (0, 0), (2, 0), (3, 1), (0, 0), (4, 0), (4, 0), (2, 0), (2, 0)]
plt.plot([int(i[0]) for i in data], [int(i[1]) for i in data], 'or')
plt.xlim(-1, 13) # Sets x-axis limits (wide enough to include x = 12)
plt.ylim(-1, 2) # Sets y-axis limits
plt.show() # Show the plot
"Set of points" makes me think you want a scatter plot instead. If you're expecting something like this:
Then you probably want pyplot's scatter() function.
import matplotlib.pyplot as plt
data = [(0, 0), (3, 0), (0, 0), (2, 0), (0, 0), (3, 0), (1, 0), (7, 0), (2, 0), (0, 0), (5, 0), (2, 1), (10, 1), (1, 0), (1, 0), (8, 0), (3, 0), (1, 0), (2, 0), (2, 0), (1, 0), (6, 1), (3, 0), (3, 0), (12, 1), (3, 0), (0, 0), (2, 0), (0, 0), (2, 0), (3, 1), (0, 0), (4, 0), (4, 0), (2, 0), (2, 0)]
x,y = zip(*data)
# plt.plot(data, 'ro') plots each column against the row index, i.e. it is
# equivalent to plt.plot(x, 'ro') followed by plt.plot(y, 'ro')
plt.scatter(x, y) # but I think you want scatter
plt.show()
For plot() note:
If x and/or y is 2-dimensional, then the corresponding columns will be plotted.
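To make the note on plot() concrete, here is a small pure-Python sketch of which points end up on the canvas in each case. It does not call matplotlib; the shortened pts list and the names series_x, series_y, xs and ys are illustrative only.

```python
pts = [(0, 0), (3, 0), (2, 1), (10, 1)]

# What plt.plot(pts, 'ro') effectively draws: one series per column,
# each plotted against the row index.
series_x = [(i, x) for i, (x, _) in enumerate(pts)]
series_y = [(i, y) for i, (_, y) in enumerate(pts)]

# What was intended: x values on one axis, y values on the other.
xs, ys = zip(*pts)

print(series_x)  # [(0, 0), (1, 3), (2, 2), (3, 10)]
print(series_y)  # [(0, 0), (1, 0), (2, 1), (3, 1)]
print(xs, ys)    # (0, 3, 2, 10) (0, 0, 1, 1)
```

Points such as (1, 3) in series_x are not in the original data, which is why the plot appears to contain extra points.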

Infinite path length of graph for networkx

I'm building a graph from an adjacency matrix. Here is my code: I first create the graph, then assign a weight to each edge, as seen here:
for element in elements:
    senMatrix[int(element.matrix_row)-1,int(element.matrix_column)-1]=1
G=nx.from_numpy_matrix(senMatrix)
for element in elements:
    G[int(element.matrix_row)-1][int(element.matrix_column)-1]['weight']=int(element.value)
print(str(G.nodes()))
print(str(G.edges()))
avgClusterNumber=nx.average_clustering(G,weight='weight')
clusterList=nx.clustering(G,weight='weight')
graphDiameter=nx.diameter(G)
The first two functions run without a problem. The last one, diameter, however raises an error about infinite path length, which makes me think there are no edges or nodes or something.
The error is:
networkx.exception.NetworkXError: Graph not connected: infinite path length
When I print them out I get this
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105]
[(0, 1), (0, 3), (0, 5), (0, 6), (0, 7), (0, 8), (0, 9), (0, 11), (0, 12), (0, 13), (0, 14), (0, 17), (0, 18), (0, 19), (0, 20), (0, 23), (0, 25), (0, 26), (0, 27), (0, 28), (0, 32), (0, 33), (0, 35), (0, 36), (0, 37), (0, 38), (0, 39), (0, 43), (0, 46), (0, 47), (0, 48), (0, 50), (0, 52), (0, 55), (0, 56), (0, 57), (0, 59), (0, 60), (0, 61), (0, 63), (0, 65), (0, 66), (0, 69), (0, 70), (0, 71), (0, 72), (0, 77), (0, 79), (0, 80), (0, 82), (0, 85), (0, 86), (0, 88), (0, 90), (0, 92), (0, 96), (0, 97), (0, 98), (0, 99), (0, 100), (0, 101), (0, 104), (0, 105), (1, 2), (1, 3), (1, 5), (1, 6), (1, 7), (1, 8), (1, 9), (1, 10), (1, 11), (1, 12), (1, 13), (1, 14), (1, 15), (1, 16), (1, 17), (1, 18), (1, 19), (1, 20), (1, 21), (1, 22), (1, 23), (1, 24), (1, 25), (1, 26), (1, 27), (1, 28), (1, 29), (1, 30), (1, 31), (1, 33), (1, 34), (1, 35), (1, 36), (1, 37), (1, 38), (1, 39), (1, 40), (1, 41), (1, 42), (1, 44), (1, 46), (1, 47), (1, 49), (1, 51), (1, 52), (1, 53), (1, 54), (1, 55), (1, 56), (1, 57), (1, 58), (1, 59), (1, 60), (1, 61), (1, 62), (1, 63), (1, 64), (1, 65), (1, 66), (1, 67), (1, 68), (1, 69), (1, 70), (1, 71), (1, 72), (1, 74), (1, 75), (1, 76), (1, 77), (1, 78), (1, 79), (1, 80), (1, 81), (1, 83), (1, 85), (1, 86), (1, 87), (1, 88), (1, 90), (1, 91), (1, 95), (1, 96), (1, 97), (1, 98), (1, 99), (1, 101), (1, 102), (1, 103), (1, 104), (1, 105), (2, 66), (2, 6), (2, 7), (2, 40), (2, 78), (2, 79), (2, 18), (2, 83), (2, 87), (2, 56), (2, 25), (2, 27), (2, 92), (2, 31), (3, 6), (3, 7), (3, 8), (3, 9), (3, 11), (3, 12), (3, 15), (3, 16), (3, 21), (3, 22), (3, 25), (3, 26), (3, 27), (3, 28), (3, 29), (3, 30), (3, 32), (3, 40), (3, 41), (3, 42), (3, 47), (3, 51), (3, 52), (3, 53), (3, 55), (3, 59), (3, 60), (3, 62), (3, 63), (3, 66), (3, 67), (3, 68), (3, 69), (3, 70),..
This prints the nodes and then the edges; obviously I've truncated the output to show the idea.
So I suppose my question is: I'm not sure why this is an issue. Does anyone know how to fix this?
I'm obviously not thinking about this in the correct light.
Try
nx.is_connected(G)
If it returns False, you can split the graph into its connected components and find the diameter of each component, using
nx.connected_components(G)
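A minimal sketch of that approach, assuming networkx is installed and using a small hypothetical graph in place of the one built from senMatrix:

```python
import networkx as nx

# Hypothetical toy graph with two separate components.
G = nx.Graph()
G.add_edges_from([(0, 1), (1, 2), (3, 4)])

connected = nx.is_connected(G)  # False here, so nx.diameter(G) would raise

# Compute the diameter of each connected component separately.
diameters = {}
for nodes in nx.connected_components(G):
    component = G.subgraph(nodes)
    diameters[frozenset(nodes)] = nx.diameter(component)

print(connected)
print(diameters)
```

If only the main part of the graph matters, the component with the most nodes can be selected with max(nx.connected_components(G), key=len) before calling nx.diameter on its subgraph.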

Change values of a list of list (of tuples) in a for statement

I want to create an object to store the positions of some creatures of a game.
A list of lists of tuples seemed appropriate to me. The matrix formed by the list of lists represents the game board, each element being a tuple of two values ('type', number). For example, ('h', 3) would mean 'there are 3 humans here'.
So here is how I initialize the board:
>>> from pprint import pprint
>>> lines = 5
>>> columns = 5
>>> board = [[(0,0)]*lines]*columns
>>> pprint(board)
[[(0, 0), (0, 0), (0, 0), (0, 0), (0, 0)],
[(0, 0), (0, 0), (0, 0), (0, 0), (0, 0)],
[(0, 0), (0, 0), (0, 0), (0, 0), (0, 0)],
[(0, 0), (0, 0), (0, 0), (0, 0), (0, 0)],
[(0, 0), (0, 0), (0, 0), (0, 0), (0, 0)]]
Then I want to put some humans in my board:
>>> board[2][2]=('h',3)
I expect the board to be:
[[(0, 0), (0, 0), (0, 0), (0, 0), (0, 0)],
[(0, 0), (0, 0), (0, 0), (0, 0), (0, 0)],
[(0, 0), (0, 0), ('h', 3), (0, 0), (0, 0)],
[(0, 0), (0, 0), (0, 0), (0, 0), (0, 0)],
[(0, 0), (0, 0), (0, 0), (0, 0), (0, 0)]]
but instead, when I do >>> pprint(board), it returns:
[[(0, 0), (0, 0), ('h', 3), (0, 0), (0, 0)],
[(0, 0), (0, 0), ('h', 3), (0, 0), (0, 0)],
[(0, 0), (0, 0), ('h', 3), (0, 0), (0, 0)],
[(0, 0), (0, 0), ('h', 3), (0, 0), (0, 0)],
[(0, 0), (0, 0), ('h', 3), (0, 0), (0, 0)]]
I don't understand why every row of the board is modified; this is very frustrating. I am certainly missing something here. Thanks for your help.
The following:
board = [[(0,0)]*lines]*columns
should become
board = [[(0,0)]*lines for _ in range(columns)]
Otherwise the top-level list consists of references to the same sublist:
In [7]: lines = 3
In [8]: columns = 4
In [9]: board = [[(0,0)]*lines]*columns
In [10]: [id(row) for row in board]
Out[10]: [18422120, 18422120, 18422120, 18422120]
In this setup, when you change one sublist, they all change.
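The difference between the two constructions can be checked directly. The sketch below uses the names aliased and independent purely for illustration:

```python
lines, columns = 5, 5

# Outer multiplication repeats a reference to ONE inner list.
aliased = [[(0, 0)] * lines] * columns
aliased[2][2] = ('h', 3)

# The comprehension builds a fresh inner list for every row.
independent = [[(0, 0)] * lines for _ in range(columns)]
independent[2][2] = ('h', 3)

print(len({id(row) for row in aliased}))      # 1 distinct row object
print(len({id(row) for row in independent}))  # 5 distinct row objects
print(sum(row[2] == ('h', 3) for row in aliased))      # 5
print(sum(row[2] == ('h', 3) for row in independent))  # 1
```

The inner multiplication [(0, 0)] * lines is safe here because tuples are immutable; only the repeated mutable inner list causes the aliasing.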

Python, need help parsing items from text file into list

I'm trying to parse items in a text file and store them into a list. The data looks something like this:
[(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0), (5, 0, 0), (6, 0, 0)]
[(10, 3, 1), (11, 3, 1), (12, 3, 1), (13, 3, 1), (13, 4, 1)]
[(10, 3, 5), (11, 3, 5), (12, 3, 5), (13, 3, 5), (13, 4, 5), (13, 5, 5), (13, 6, 5)]
[(6, 13, 5), (7, 13, 5), (8, 13, 5), (8, 14, 5), (7, 14, 5), (6, 14, 5), (6, 14, 6)]
I was able to strip the "[" and "]" but couldn't store the rest of the information in a list in this format: (x, y, z). Any help?
def dataParser(fileName):
    zoneList=[]; zone=[]
    input=open(fileName,"r")
    for line in input:
        vals = line.strip("[")
        newVals = vals.strip("]\n")
        print(newVals)
        v=newVals[0:9]
        zone.append(v)
    input.close()
    return zone
In this particular case, you can use ast.literal_eval:
>>> import ast
>>> with open("list.txt") as fp:
...     data = [ast.literal_eval(line) for line in fp if line.strip()]
...
>>> data
[[(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0), (5, 0, 0), (6, 0, 0)], [(10, 3, 1), (11, 3, 1), (12, 3, 1), (13, 3, 1), (13, 4, 1)], [(10, 3, 5), (11, 3, 5), (12, 3, 5), (13, 3, 5), (13, 4, 5), (13, 5, 5), (13, 6, 5)], [(6, 13, 5), (7, 13, 5), (8, 13, 5), (8, 14, 5), (7, 14, 5), (6, 14, 5), (6, 14, 6)]]
It's the "safe" version of eval. It's not as general, though, for precisely that reason. If you're generating this input, you might want to look into a different way to save your data ("serialization"), whether using pickle or something like JSON -- there are lots of examples of using both you can find on SO and elsewhere.
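If you control how the file is written, a JSON round trip might look like the sketch below. It is only an illustration with made-up sample data; note that JSON has no tuple type, so the tuples come back as lists and are converted again after loading.

```python
import json

zones = [[(0, 0, 0), (1, 0, 0)], [(10, 3, 1), (11, 3, 1), (13, 4, 1)]]

# Serialize: tuples are written as JSON arrays.
text = json.dumps(zones)

# Deserialize: arrays come back as lists, so rebuild the tuples.
restored = [[tuple(point) for point in zone] for zone in json.loads(text)]

print(text)
print(restored == zones)  # True
```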
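You can do it without eval, using the string split method and the tuple constructor (note that this example has no spaces inside the tuples, so splitting on ', ' separates the tuples rather than the numbers, and the elements stay strings):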
>>> st = "[(0,0,0), (1,0,0)]"
>>> splits = st.strip('[').strip(']\n').split(', ')
>>> splits
['(0,0,0)', '(1,0,0)']
>>> for split in splits:
...     trimmed = split.strip('(').strip(')')
...     tup = tuple(trimmed.split(','))
...     print(tup, type(tup))
...
('0', '0', '0') <class 'tuple'>
('1', '0', '0') <class 'tuple'>
>>>
From there, it's just appending to a list.
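Putting those steps together, a hedged sketch of a per-line parser could look like this. The function name parse_line is hypothetical, and the code assumes every line has the exact "[(a, b, c), ...]" shape with integer values; it splits on "),", so it also copes with spaces inside the tuples.

```python
def parse_line(line):
    """Turn '[(0, 0, 0), (1, 0, 0)]' into [(0, 0, 0), (1, 0, 0)]."""
    body = line.strip().strip("[]")
    if not body:
        return []
    # Drop the outermost parentheses, then split between tuples.
    groups = body.strip("()").split("),")
    # int() tolerates surrounding spaces, so "0, 0, 0" parses cleanly.
    return [tuple(int(n) for n in group.strip(" ()").split(","))
            for group in groups]


sample = "[(10, 3, 1), (11, 3, 1), (13, 4, 1)]\n"
zone = parse_line(sample)
print(zone)  # [(10, 3, 1), (11, 3, 1), (13, 4, 1)]
```

Applying parse_line to each non-empty line of the file and appending the result gives the list of zones.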
Some might not like using eval() here, but you can do this in one line using it:
In [20]: lis=eval("[(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0), (5, 0, 0), (6, 0, 0)]")
In [23]: lis
Out[23]: [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0), (5, 0, 0), (6, 0, 0)]
using a text file:
In [44]: with open('data.txt') as f:
   ....:     lis=[eval(x.strip()) for x in f]
   ....:     print(lis)
   ....:
[[(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0), (5, 0, 0), (6, 0, 0)], [(10, 3, 1), (11, 3, 1), (12, 3, 1), (13, 3, 1), (13, 4, 1)], [(10, 3, 5), (11, 3, 5), (12, 3, 5), (13, 3, 5), (13, 4, 5), (13, 5, 5), (13, 6, 5)], [(6, 13, 5), (7, 13, 5), (8, 13, 5), (8, 14, 5), (7, 14, 5), (6, 14, 5), (6, 14, 6)]]
The following is a bad idea if you're getting this data from any source that you don't trust completely, but if the data will always be in this format (and only contain numbers as the elements) something like this is quite straightforward:
collect = []
for line in input:
    collect.append(eval(line))
The other answers work just fine and are a simple solution to this specific problem. But I am assuming that if you were having problems with string manipulation, then a simple eval() function won't help you out much the next time you have this problem.
As a general rule, the first thing to do when you approach a problem like this is to define your delimiters.
[(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0), (5, 0, 0), (6, 0, 0)]
Here you can see that "), (" is a potential delimiter between groups and a simple comma (",") is your delimiter between values. Next you want to see what you need to remove, and as you pointed out, brackets ("[" and "]") provide little information. We can also see that because we are dealing with numeric values, all spacing gives us little information and needs to be removed.
Building on this information, I set up your dataParser function in a way that returns the values you were looking for:
fileName= "../ZoneFinding/outputData/zoneFinding_tofu_rs1000.txt"
def dataParser(fileName):
    with open(fileName,"r") as input:
        zoneLst = []
        for line in input:
            # First remove white space (including the trailing newline) and the bracket+parenthesis combos on the ends
            line = line.strip().replace(" ","").replace("[(","").replace(")]","")
            # Now let's split the line by "),(" to create a list of strings with the values
            lineLst = line.split("),(")
            # At this point lineLst = ["0,0,0" , "1,0,0", "2,0,0", ...]
            # Lastly, we will split each group by "," and add each value to a list
            zone = [group.split(",") for group in lineLst]
            zoneLst.append(zone)
    return zoneLst
In the example above, all of the values are stored as strings. You could also replace the definition of zone with the code below to store the values as floats.
zone = [[float(val) for val in group.split(",")] for group in lineLst]
