Representing a list of tuples as a 2D array in Python

Say I have a list of tuples containing the RGB information of each pixel in an image, from left to right, top to bottom.
[(1,2,3),(2,3,4),(1,2,4),(9,2,1),(1,1,1),(3,4,5)]
Assuming the width and height of the image are already known, is there a way I can represent the image using a list of lists?
For example, let's say the above list of tuples represents a 2x3 image; image[1][2] should give me the RGB tuple (3,4,5).

Use the step argument of range (or xrange in Python 2):
>>> width = 2
>>> pixels = [(1,2,3),(2,3,4),(1,2,4),(9,2,1),(1,1,1),(3,4,5)]
>>> image = [pixels[x:x+width] for x in range(0,len(pixels),width)]
>>> image
[[(1, 2, 3), (2, 3, 4)], [(1, 2, 4), (9, 2, 1)], [(1, 1, 1), (3, 4, 5)]]
The step makes x advance by width on each iteration instead of the default of 1, so each slice pixels[x:x+width] is one row of the image. If you are familiar with Java, it's similar to:
for (int x=0; x<length; x = x+step)
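As a minimal sketch, the same slicing can be wrapped in a helper. The name to_rows is hypothetical, and the sketch assumes that the question's "2x3 image" means 2 rows of 3 pixels, so the width passed in is 3 rather than 2:

```python
def to_rows(pixels, width):
    """Split a flat, row-major list of pixels into rows of `width` pixels."""
    if width <= 0 or len(pixels) % width != 0:
        raise ValueError("width must be positive and divide the pixel count")
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


pixels = [(1, 2, 3), (2, 3, 4), (1, 2, 4), (9, 2, 1), (1, 1, 1), (3, 4, 5)]
image = to_rows(pixels, width=3)
print(image[1][2])  # (3, 4, 5)
```

With width=2 the same list becomes 3 rows of 2 pixels, and image[1][2] would raise an IndexError, so the choice of width decides which indexing the question's example expects.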

Related

Getting the correct max value from a list of tuples

My list of tuples looks like this:
[(0, 0), (3, 0), (3, 3), (0, 3), (0, 0), (0, 6), (3, 6), (3, 9), (0, 9), (0, 6), (6, 0), (9, 0), (9, 3), (6, 3), (6, 0), (0, 3), (3, 3), (3, 6), (0, 6), (0, 3)]
Each tuple has the format (X, Y), and I want to get the max and min of all Xs and Ys in this list.
It should be min(X)=0, max(X)=9, min(Y)=0, max(Y)=9
However, when I do this:
min(listoftuples)[0], max(listoftuples)[0]
min(listoftuples)[1], max(listoftuples)[1]
...for the Y values, the maximum value shown is 3 which is incorrect.
Why is that?
"for the Y values, the maximum value shown is 3"
That is because max(listoftuples) returns the tuple (9, 3), so max(listoftuples)[0] is 9 and max(listoftuples)[1] is 3.
By default, tuples are compared element by element: by the value at the first index, then, in case of a tie, by the value at the second index, and so on.
If you want to find the tuple with the maximum value at the second index, you need to use a key function:
from operator import itemgetter
li = [(0, 0), (3, 0), ... ]
print(max(li, key=itemgetter(1)))
# or max(li, key=lambda t: t[1])
outputs
(3, 9)
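To make the comparison rule concrete, here is a small sketch using the list from the question. listoftuples is the question's name; y_min and y_max are only illustrative. It shows that max compares whole tuples, and that a key function gives the extreme along one coordinate:

```python
from operator import itemgetter

listoftuples = [(0, 0), (3, 0), (3, 3), (0, 3), (0, 0), (0, 6), (3, 6),
                (3, 9), (0, 9), (0, 6), (6, 0), (9, 0), (9, 3), (6, 3),
                (6, 0), (0, 3), (3, 3), (3, 6), (0, 6), (0, 3)]

# Tuples compare element by element, so (9, 3) beats (3, 9) on the first element.
print((9, 3) > (3, 9))        # True
print(max(listoftuples))      # (9, 3)

# A key function picks the coordinate to compare; index [1] then extracts Y.
y_max = max(listoftuples, key=itemgetter(1))[1]
y_min = min(listoftuples, key=itemgetter(1))[1]
print(y_min, y_max)           # 0 9
```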
Here is a simple way to do it using list comprehensions:
min([t[0] for t in arr])
max([t[0] for t in arr])
min([t[1] for t in arr])
max([t[1] for t in arr])
In this code, I have used a list comprehension to create a list of all X and all Y values and then found the min/max for each list. This produces your desired answer.
The first two lines are for the X values and the last two lines are for the Y values.
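A variation on the same idea, as a sketch: generator expressions with tuple unpacking compute the same four values without building temporary lists. The helper name bounds is hypothetical, and the sample data is shortened from the question.

```python
def bounds(points):
    """Return ((x_min, x_max), (y_min, y_max)) for a list of (x, y) tuples."""
    x_range = (min(x for x, _ in points), max(x for x, _ in points))
    y_range = (min(y for _, y in points), max(y for _, y in points))
    return x_range, y_range


arr = [(0, 0), (3, 0), (3, 9), (9, 3), (6, 0)]  # shortened sample data
print(bounds(arr))  # ((0, 9), (0, 9))
```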
Tuples are ordered by their first value, then in case of a tie, by their second value (and so on). That means max(listoftuples) is (9, 3). See How does tuple comparison work in Python?
So to find the highest y-value, you have to look specifically at the second elements of the tuples. One way you could do that is by splitting the list into x-values and y-values, like this:
xs, ys = zip(*listoftuples)
Or if you find that confusing, you could use this instead, which is roughly equivalent:
xs, ys = ([t[i] for t in listoftuples] for i in range(2))
Then get each of their mins and maxes, like this:
x_min_max, y_min_max = [(min(L), max(L)) for L in (xs, ys)]
print(x_min_max, y_min_max) # -> (0, 9) (0, 9)
Another way is to use NumPy to treat listoftuples as a matrix.
import numpy as np
a = np.array(listoftuples)
x_min_max, y_min_max = [(min(column), max(column)) for column in a.T]
print(x_min_max, y_min_max) # -> (0, 9) (0, 9)
(There's probably a more idiomatic way to do this, but I'm not super familiar with NumPy.)
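One candidate for a more idiomatic form, offered as a sketch: NumPy arrays have min and max methods that take an axis argument, so the column-wise extremes can be computed without a Python loop. The sample data is shortened from the question.

```python
import numpy as np

listoftuples = [(0, 0), (3, 0), (3, 9), (9, 3), (6, 0)]  # shortened sample data
a = np.array(listoftuples)

# axis=0 reduces over the rows, leaving one value per column (X, then Y).
mins = a.min(axis=0)
maxs = a.max(axis=0)
print(mins.tolist(), maxs.tolist())  # [0, 0] [9, 9]
```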

What does array[...,list([something]) mean?

I am going through the following lines of code but I didn't understand image[...,list()]. What do the three dots mean?
self.probability = 0.5
self.indices = list(permutations(range(3), 3))
if random.random() < self.probability:
    image = np.asarray(image)
    image = Image.fromarray(image[...,list(self.indices[random.randint(0, len(self.indices) - 1)])])
What exactly is happening in the above lines?
As I understand it, the list() part takes random channels from the image. Am I correct?
The three dots are a Python object called Ellipsis. In NumPy indexing it stands in for as many full slices (:) as are needed to cover the remaining axes:
x = np.random.rand(3,3,3,3,3)
elem = x[:, :, :, :, 0]
elem = x[..., 0] # same as above
This should be helpful if you want to access a specific element in a multi-dimensional array in NumPy.
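To see that the three dots are an ordinary object passed to the indexing operation, here is a small pure-Python sketch. The class KeyEcho is hypothetical and simply returns whatever key it is indexed with:

```python
class KeyEcho:
    """Hypothetical helper: indexing it returns the key it received."""

    def __getitem__(self, key):
        return key


echo = KeyEcho()
print(... is Ellipsis)   # True: `...` is the literal for the Ellipsis object
print(echo[..., 0])      # (Ellipsis, 0)
print(echo[:, :, 0])     # (slice(None, None, None), slice(None, None, None), 0)
```

NumPy receives the same kind of tuple and expands the Ellipsis into as many full slices as the array has remaining axes.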
list(permutations(range(3), 3)) generates all permutations of the integers 0, 1, 2.
from itertools import permutations
list(permutations(range(3), 3))
# [(0, 1, 2), (0, 2, 1), (1, 0, 2), (1, 2, 0), (2, 0, 1), (2, 1, 0)]
So the following chooses among these tuples of permutations:
list(self.indices[random.randint(0, len(self.indices) - 1)])
In any case you'll have a permutation over the last axis of image, which is usually the image channels (RGB). Note that with the ellipsis (...) in image[...,ixs] we are taking full slices over all axes except the last, so this performs a shuffling of the image channels.
An example run:
indices = list(permutations(range(3), 3))
indices[np.random.randint(0, len(indices))]  # unlike random.randint, the upper bound is exclusive
# (2, 0, 1)
Here's an example. Note that this does not change the shape; we are using integer array indexing on the last axis only:
a = np.random.randint(0,5,(5,5,3))
a[...,(0,2,1)].shape
# (5, 5, 3)
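For intuition, the same channel reordering can be sketched without NumPy on a flat list of RGB tuples. The function name permute_channels and the sample pixels are assumptions made for illustration:

```python
import random
from itertools import permutations


def permute_channels(pixels, order):
    """Reorder the channels of every pixel, e.g. order (2, 0, 1) maps RGB to BRG."""
    return [tuple(pixel[i] for i in order) for pixel in pixels]


pixels = [(1, 2, 3), (9, 2, 1)]
print(permute_channels(pixels, (2, 0, 1)))  # [(3, 1, 2), (1, 9, 2)]

# Picking the order at random from the six permutations, as the question's code does:
order = random.choice(list(permutations(range(3), 3)))
shuffled = permute_channels(pixels, order)
print(len(shuffled) == len(pixels))  # True: only the channel order changes
```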

How to split items inside a list, surrounded by brackets, separated by , [duplicate]

I have a string of plot points such as:
plots = [(0, 1), (0, 2), (1, 4), ... (0.4, 0.8) etc
But I want to split them into their respective x's and y's:
x = (0, 0, 1, ... 0.4)
y = (1, 2, 4, ... 0.8)
I am unsure how to do this.
Python provides the zip() function:
x, y = zip(*plots)
This does exactly what you need: the asterisk unpacks the list into separate tuple arguments, then zip "zips" those tuples together, that is, it creates one tuple for each index position.
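A small worked sketch of that unpacking, using only the four points that are visible in the question (the elided ones are left out):

```python
plots = [(0, 1), (0, 2), (1, 4), (0.4, 0.8)]

# zip(*plots) is the same call as zip((0, 1), (0, 2), (1, 4), (0.4, 0.8)).
x, y = zip(*plots)
print(x)  # (0, 0, 1, 0.4)
print(y)  # (1, 2, 4, 0.8)

# Zipping the two tuples again restores the original pairs.
print(list(zip(x, y)) == plots)  # True
```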
plots = [(0, 1), (0, 2), (1, 4), (0.4, 0.8)]
x = [plot[0] for plot in plots]
y = [plot[1] for plot in plots]
This could be an inline solution 😃
plots = [(0, 1), (0, 2), (1, 4)]
x, y = [item[0] for item in plots], [item[1] for item in plots]

Dimensionality agnostic (generic) cartesian product [duplicate]

I'm looking to generate the cartesian product of a relatively large number of arrays to span a high-dimensional grid. Because of the high dimensionality, it won't be possible to store the result of the cartesian product computation in memory; rather it will be written to hard disk. Because of this constraint, I need access to the intermediate results as they are generated. What I've been doing so far is this:
for x in xrange(0, 10):
    for y in xrange(0, 10):
        for z in xrange(0, 10):
            writeToHdd(x,y,z)
which, apart from being very nasty, is not scalable (i.e. it would require me writing as many loops as dimensions). I have tried to use the solution proposed here, but that is a recursive solution, which therefore makes it quite hard to obtain the results on the fly as they are being generated. Is there any 'neat' way to do this other than having a hardcoded loop per dimension?
In plain Python, you can generate the Cartesian product of a collection of iterables using itertools.product.
>>> import itertools
>>> arrays = range(0, 2), range(4, 6), range(8, 10)
>>> list(itertools.product(*arrays))
[(0, 4, 8), (0, 4, 9), (0, 5, 8), (0, 5, 9), (1, 4, 8), (1, 4, 9), (1, 5, 8), (1, 5, 9)]
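Because itertools.product returns a lazy iterator, the tuples can also be consumed one at a time instead of being collected with list(). The sketch below assumes a plain text file as the destination; write_product is a hypothetical stand-in for the question's writeToHdd, and the temporary directory is only there to make the example self-contained:

```python
import itertools
import os
import tempfile


def write_product(arrays, path):
    """Write one comma-separated line per tuple of the Cartesian product."""
    count = 0
    with open(path, "w") as out:
        # product() yields tuples lazily, so the full product is never stored.
        for combo in itertools.product(*arrays):
            out.write(",".join(map(str, combo)) + "\n")
            count += 1
    return count


arrays = [range(0, 2), range(4, 6), range(8, 10)]
path = os.path.join(tempfile.mkdtemp(), "product.txt")
print(write_product(arrays, path))  # 8
```

The number of input arrays is not hard-coded anywhere, so the same function covers any number of dimensions.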
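In NumPy, you can combine numpy.meshgrid (passing sparse=True to avoid expanding the product in memory) with numpy.ndindex. The sparse grids have to be broadcast against each other before they can be indexed; numpy.broadcast_arrays does that with views rather than copies:
>>> import numpy as np
>>> arrays = np.arange(0, 2), np.arange(4, 6), np.arange(8, 10)
>>> grid = np.broadcast_arrays(*np.meshgrid(*arrays, sparse=True))
>>> [tuple(int(g[i]) for g in grid) for i in np.ndindex(grid[0].shape)]
[(0, 4, 8), (0, 4, 9), (1, 4, 8), (1, 4, 9), (0, 5, 8), (0, 5, 9), (1, 5, 8), (1, 5, 9)]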
I think I figured out a nice way using a memory-mapped file:
import numpy as np
def carthesian_product_mmap(vectors, filename, mode='w+'):
    '''
    Vectors should be a tuple of `numpy.ndarray` vectors. You could
    also make it more flexible, and include some error checking
    '''
    # Make a meshgrid with `copy=False` to create views
    grids = np.meshgrid(*vectors, copy=False, indexing='ij')
    # The shape for concatenating the grids from meshgrid
    shape = grids[0].shape + (len(vectors),)
    # Find the "highest" dtype necessary
    dtype = np.result_type(*vectors)
    # Instantiate the memory mapped file
    M = np.memmap(filename, dtype, mode, shape=shape)
    # Fill the memmap with the grids
    for i, grid in enumerate(grids):
        M[...,i] = grid
    # Make sure the data is written to disk (optional?)
    M.flush()
    # Reshape to put it in the right format for Cartesian product
    return M.reshape((-1, len(vectors)))
But I wonder if you really need to store the whole Cartesian product (there's a lot of data duplication). Is it not an option to generate the rows in the product at the moment they're needed?
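Generating rows on demand can be sketched in plain Python: treat the row number as a mixed-radix number and peel off one digit per vector. The function name product_row is hypothetical; the ordering is chosen to match itertools.product, where the last vector varies fastest.

```python
def product_row(vectors, k):
    """Return row k of the Cartesian product of `vectors` without building it."""
    total = 1
    for v in vectors:
        total *= len(v)
    if not 0 <= k < total:
        raise IndexError("row index out of range")
    row = []
    # Work from the last vector backwards: it varies fastest.
    for v in reversed(vectors):
        k, digit = divmod(k, len(v))
        row.append(v[digit])
    return tuple(reversed(row))


vectors = [[0, 1], [4, 5], [8, 9]]
print(product_row(vectors, 0))  # (0, 4, 8)
print(product_row(vectors, 5))  # (1, 4, 9)
```

Only the input vectors are kept in memory, so any single row can be produced without storing the product.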
It seems you just want to loop over an arbitrary number of dimensions. My generic solution for this is to keep a list of indices, increment the first one, and handle overflows by carrying into the next dimension.
Example:
n = 3 # number of dimensions
N = 1 # highest index value per dimension
idx = [0]*n
while True:
    print(idx)
    # increase first dimension
    idx[0] += 1
    # handle overflows
    for i in range(0, n-1):
        if idx[i] > N:
            # reset this dimension and increase next higher dimension
            idx[i] = 0
            idx[i+1] += 1
    if idx[-1] > N:
        # overflow in the last dimension, we are finished
        break
Gives:
[0, 0, 0]
[1, 0, 0]
[0, 1, 0]
[1, 1, 0]
[0, 0, 1]
[1, 0, 1]
[0, 1, 1]
[1, 1, 1]
NumPy has something similar built in: ndenumerate (and ndindex, which takes just a shape).
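The same counter can be sketched as a generator that accepts a different size for each dimension. The name odometer is hypothetical; as in the loop above, the first index varies fastest.

```python
def odometer(sizes):
    """Yield every index tuple for the given dimension sizes, first index fastest."""
    if any(size <= 0 for size in sizes):
        return  # an empty dimension means there is nothing to iterate over
    idx = [0] * len(sizes)
    while True:
        yield tuple(idx)
        # Increment with carry, starting at the first dimension.
        for i, size in enumerate(sizes):
            idx[i] += 1
            if idx[i] < size:
                break
            idx[i] = 0
        else:
            return  # every dimension wrapped around: all indices visited


print(list(odometer([2, 2, 2]))[:4])  # [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]
```

Because it is a generator, each index tuple can be processed or written out as soon as it is produced.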

List coordinates between a set of coordinates

This should be fairly easy, but I'm getting a headache from trying to figure it out. I want to list all the coordinates between two points. Like so:
1: (1,1)
2: (1,3)
In between: (1,2)
Or
1: (1,1)
2: (5,1)
In between: (2,1), (3,1), (4,1)
It does not need to work with diagonals.
You appear to be a beginning programmer. A general technique I find useful is to do the job yourself, on paper, then look at how you did it and translate that to a program. If you can't see how, break it down into simpler steps until you can.
Depending on how you want to handle the edge cases, this seems to work:
def points_between(p1, p2):
    xs = range(p1[0] + 1, p2[0]) or [p1[0]]
    ys = range(p1[1] + 1, p2[1]) or [p1[1]]
    return [(x,y) for x in xs for y in ys]
print(points_between((1,1), (5,1)))
# [(2, 1), (3, 1), (4, 1)]
print(points_between((5,6), (5,12)))
# [(5, 7), (5, 8), (5, 9), (5, 10), (5, 11)]
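One edge case worth deciding on is the order of the endpoints. As a sketch, under the assumption that the two points share a row or a column, the variant below (hypothetical name points_between_any_order) accepts the points in either order:

```python
def points_between_any_order(p1, p2):
    """List the grid points strictly between two axis-aligned points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 != x2 and y1 != y2:
        raise ValueError("points must share an x or a y coordinate")
    if x1 == x2:
        step = 1 if y2 >= y1 else -1
        return [(x1, y) for y in range(y1 + step, y2, step)]
    step = 1 if x2 > x1 else -1
    return [(x, y1) for x in range(x1 + step, x2, step)]


print(points_between_any_order((1, 1), (5, 1)))  # [(2, 1), (3, 1), (4, 1)]
print(points_between_any_order((5, 1), (1, 1)))  # [(4, 1), (3, 1), (2, 1)]
print(points_between_any_order((1, 1), (1, 3)))  # [(1, 2)]
```

Here adjacent or identical points give an empty list, and diagonal input is rejected instead of being silently accepted.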
