Remove coordinates from lists in Python

This is a follow-up question from my last question:
Python3 Numpy np.where Error.
I have 2 lists like these:
x = [None,[1, 15, 175, 20],
[150, 175, 18, 20],
[150, 175, 18],
[192, 150, 177],...]
y = [None,[12, 43, 55, 231],
[243, 334, 44, 12],
[656, 145, 138],
[12, 150, 177],
[150, 177, 188],...]
I want to remove the x values lower than 30 and y values that correspond to the removed x values. (For example, (x,y) = (1,12) in x[1] and y[1])
In order to do that, I got the corrected x list:
In : [[v2 for v2 in v1 if v2>=30] for v1 in x[1:]]
Out: [[175], [150, 175], [150, 175], [192, 150, 177]]
I also got the coordinates of the removed x values:
In : [(i,j) for i,v1 in enumerate(x[1:]) for j,v2 in enumerate(v1) if v2<30]
Out: [(0, 0), (0, 1), (0, 3), (1, 2), (1, 3), (2, 2)]
Now I want to use these coordinates to remove items from y.
How can I implement this?

To get the corrected y values, I would recommend bypassing the coordinates entirely as a first approach. The reason is that you may end up with empty lists along the way, which will throw off the shape of the output if you don't keep special track of them. Also, removing elements is generally much more awkward than not including them in the first place.
It would be much easier to make a corrected version of y in the same way you corrected x:
y_corr = [[n for m, n in zip(row_x, row_y) if m >= 30] for row_x, row_y in zip(x[1:], y[1:])]
Here we just used zip to step along both sets of lists in the same way you did with one.
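As a minimal sketch of this zip approach on the sample data from the question (the elided `...` rows are left out, the extra y row is dropped so both lists line up, and `THRESHOLD` is a hypothetical name for the cutoff), both lists can be filtered the same way:

```python
# Sample data from the question, limited to the rows present in both lists.
x = [None, [1, 15, 175, 20], [150, 175, 18, 20], [150, 175, 18], [192, 150, 177]]
y = [None, [12, 43, 55, 231], [243, 334, 44, 12], [656, 145, 138], [12, 150, 177]]

THRESHOLD = 30  # hypothetical name for the cutoff used in the question

# Skip the leading None placeholder with a slice, then filter row by row.
x_corr = [[m for m in row_x if m >= THRESHOLD] for row_x in x[1:]]
y_corr = [[n for m, n in zip(row_x, row_y) if m >= THRESHOLD]
          for row_x, row_y in zip(x[1:], y[1:])]

print(x_corr)  # [[175], [150, 175], [150, 175], [192, 150, 177]]
print(y_corr)  # [[55], [243, 334], [656, 145], [12, 150, 177]]
```

Rows that lose all their elements stay in place as empty lists, so the shape of the output matches the input.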
If you absolutely insist on using the coordinates, I would recommend just copying y entirely and removing the elements from the corrected copy. You have to go backwards in each row to avoid shifting the meaning of the coordinates (e.g. with reversed). You could use itertools.groupby to do the actual iteration for each row:
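from itertools import groupby
from operator import itemgetter
# coord: sorted (row, column) pairs of the values to discard, relative to x[1:]
y_corr = [row.copy() for row in y[1:]]
for r, group in groupby(reversed(coord), itemgetter(0)):
    for c in map(itemgetter(1), group):
        del y_corr[r][c]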
Instead of reversing coord, you can reverse each group individually. Since a group is an iterator, it has to be materialized first, e.g. with map(itemgetter(1), reversed(list(group))).
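Why the deletion has to run backwards can be seen in a small sketch. The row and the column list here are taken from the first row of the question's data purely for illustration: deleting a lower index first shifts everything after it, so the later coordinates no longer point at the intended elements.

```python
row = [12, 43, 55, 231]
cols_to_drop = [0, 1, 3]  # columns whose x value was below 30

# Forward deletion: after deleting index 0, the old index 1 becomes index 0.
forward = row.copy()
try:
    for c in cols_to_drop:
        del forward[c]
except IndexError:
    forward = None  # index 3 no longer exists by the time we reach it

# Backward deletion: higher indices go first, so the lower ones stay valid.
backward = row.copy()
for c in reversed(cols_to_drop):
    del backward[c]

print(forward)   # None
print(backward)  # [55]
```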
A better approach might be to compute the coordinates of the retained values instead of the discarded ones. I would recommend pre-allocating the output list, to help keep track of the empty lists and preserve the shape:
from itertools import groupby
from operator import itemgetter
# skip the None placeholder row so enumerate(row) does not fail
coord = [(r, c) for r, row in enumerate(x) if row is not None for c, n in enumerate(row) if n >= 30]
y_corr = [[] for _ in x]  # one independent empty list per row
for r, group in groupby(coord, itemgetter(0)):
    y_corr[r] = [y[r][c] for c in map(itemgetter(1), group)]
If you don't care about preserving the empty rows, you can skip the loop and use a one-liner instead:
y_corr = [[y[r][c] for c in map(itemgetter(1), group)] for r, group in groupby(coord, itemgetter(0))]

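new_y = []
for i, row in enumerate(y[1:]):
    new_y.append([v for j, v in enumerate(row) if (i, j) not in BadList])
where BadList is the list of discarded coordinates from the question (its indices are relative to x[1:], hence the y[1:] slice):
[(i,j) for i,v1 in enumerate(x[1:]) for j,v2 in enumerate(v1) if v2<30]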
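Testing membership in a list scans the whole list each time, so for larger inputs a set is a natural substitute. A minimal sketch, assuming the sample data from the question with the elided rows left out (`bad` is a hypothetical name):

```python
x = [None, [1, 15, 175, 20], [150, 175, 18, 20], [150, 175, 18], [192, 150, 177]]
y = [None, [12, 43, 55, 231], [243, 334, 44, 12], [656, 145, 138], [12, 150, 177]]

# Coordinates of the discarded values, stored in a set for fast membership tests.
bad = {(i, j) for i, row in enumerate(x[1:]) for j, v in enumerate(row) if v < 30}

# The indices are relative to x[1:], so y is sliced the same way.
new_y = [[v for j, v in enumerate(row) if (i, j) not in bad]
         for i, row in enumerate(y[1:])]

print(new_y)  # [[55], [243, 334], [656, 145], [12, 150, 177]]
```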

You can get it using zip with
In [395]: [(a, b) for z in list(zip(x, y))[1:] for a, b in list(zip(*z)) if a >= 30]
Out[395]:
[(175, 55),
(150, 243),
(175, 334),
(150, 656),
(175, 145),
(192, 12),
(150, 150),
(177, 177)]
This is the equivalent of
In [396]: v = []
In [398]: for z in list(zip(x, y))[1:]:
     ...:     for a, b in list(zip(*z)):
     ...:         if a >= 30:
     ...:             v.append((a,b))
     ...:
Where
In [388]: list(zip(x, y))[1:]
Out[388]:
[([1, 15, 175, 20], [12, 43, 55, 231]),
([150, 175, 18, 20], [243, 334, 44, 12]),
([150, 175, 18], [656, 145, 138]),
([192, 150, 177], [12, 150, 177])]
and
In [392]: list(zip(*list(zip(x, y))[1]))
Out[392]: [(1, 12), (15, 43), (175, 55), (20, 231)]

Related

How to turn a list of lists into pairs of points?

For example if I have:
a = [[[1, 1, 10], [148, 191, 203]],
[[133, 100], [19, 34]],
[[230, 200], [44, 68]]]
I would like to turn "a" into:
[(1,148), (1,191), (10,203), (133,19), (100,34), (230,44), (200,68)]
Basically within each inner list I have a list of x values and a list of y values and I would like to pair them together. So a[0][0][0] and a[0][1][0] would be a pair. Is there a simple way that I would be able to do this? Thanks!
You can use zip to put together each pair of lists into a list of tuples:
a = [[[1, 1, 10], [148, 191, 203]],
[[133, 100], [19, 34]],
[[230, 200], [44, 68]]]
print([z for x, y in a for z in zip(x, y)])
Output:
[(1, 148), (1, 191), (10, 203), (133, 19), (100, 34), (230, 44), (200, 68)]

Finding calculation value in python list

I have a python list like this.
[100, 150, 30, 50, 10, 20, 40]
I want to find the elements of the list above that add up to 90 (90 itself is of course not in the list). My expected result is:
[50, 40]
or
[50, 30, 10]
How can I achieve that in python or other programming language?
You can use a list comprehension to iterate over all the combinations of the list elements and keep those that sum to your target value:
>>> from itertools import combinations
>>> data = [100, 150, 30, 50, 10, 20, 40]
>>> target = 90
>>> [comb for n in range(1, len(data)+1) for comb in combinations(data, n) if sum(comb) == target]
[(50, 40), (30, 50, 10), (30, 20, 40)]
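If only one matching combination is needed, as the question suggests, the search can stop at the first hit instead of building the full list. A minimal sketch, assuming a hypothetical helper named `first_subset_with_sum`; it is still exponential in the worst case:

```python
from itertools import combinations

def first_subset_with_sum(data, target):
    """Return the first combination of elements summing to target, or None."""
    for n in range(1, len(data) + 1):
        for comb in combinations(data, n):
            if sum(comb) == target:
                return comb
    return None

data = [100, 150, 30, 50, 10, 20, 40]
print(first_subset_with_sum(data, 90))  # (50, 40)
print(first_subset_with_sum(data, 5))   # None
```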
For this, you need to check each combination of numbers. For example:
take 100 and check whether it is less than 90. If it is, pair it with every other number less than 90 and see whether they add up to 90. If a pair adds up to less than 90, combine it with the remaining numbers to see whether they then add up to 90.
Try using a list comprehension and a set (lst is your list; this only finds pairs of distinct values):
>>> set([x for x in lst for y in lst if x+y == 90 and x != y])
{40, 50}

how to split a list of numbers into sub lists based on values in python? [duplicate]

I've googled, I've tested, and this has me at my wits end. I have a list of numbers I need to group by similarity. For instance, in a list of [1, 6, 9, 100, 102, 105, 109, 134, 139], 1 6 9 would be put into a list, 100, 102, 105, and 109 would be put into a list, and 134 and 139. I'm terrible at math, and I've tried and tried this, but I can't get it to work. To be explicit as possible, I wish to group numbers that are within 10 values away from one another. Can anyone help? Thanks.
There are many ways to do cluster analysis. One simple approach is to look at the gap size between successive data elements:
def cluster(data, maxgap):
    '''Arrange data into groups where successive elements
       differ by no more than *maxgap*
        >>> cluster([1, 6, 9, 100, 102, 105, 109, 134, 139], maxgap=10)
        [[1, 6, 9], [100, 102, 105, 109], [134, 139]]
        >>> cluster([1, 6, 9, 99, 100, 102, 105, 134, 139, 141], maxgap=10)
        [[1, 6, 9], [99, 100, 102, 105], [134, 139, 141]]
    '''
    data.sort()
    groups = [[data[0]]]
    for x in data[1:]:
        if abs(x - groups[-1][-1]) <= maxgap:
            groups[-1].append(x)
        else:
            groups.append([x])
    return groups
if __name__ == '__main__':
    import doctest
    print(doctest.testmod())
This will find the groups:
import itertools
nums = [1, 6, 9, 100, 102, 105, 109, 134, 139]
for k, g in itertools.groupby(nums, key=lambda n: n//10):
    print(k, list(g))
0 [1, 6, 9]
10 [100, 102, 105, 109]
13 [134, 139]
Note that if nums isn't actually sorted as your sample shows, you'll need to sort it first.
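One caveat worth spelling out: the `n//10` key groups by decade (0-9, 10-19, ...), not by the distance between neighbours, so two close values on either side of a boundary land in different groups. A small sketch with made-up numbers:

```python
import itertools

nums = [8, 9, 11, 12, 30]

# 9 and 11 differ by only 2, yet they fall into different buckets.
buckets = [(k, list(g)) for k, g in itertools.groupby(nums, key=lambda n: n // 10)]

print(buckets)  # [(0, [8, 9]), (1, [11, 12]), (3, [30])]
```

Whether that matters depends on whether "within 10 of one another" is meant literally; the sample data in the question happens not to straddle a boundary.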
First, you can easily convert any sequence into a sequence of pairs of adjacent items. Just tee it, shift it forward, and zip the shifted and unshifted copies. The only trick is that you need to start with either (<something>, 1) or (139, <something>), because in this case we want not each pair of elements, but a pair for each element:
import itertools
def pairify(it):
    it0, it1 = itertools.tee(it, 2)
    first = next(it0)
    return zip(itertools.chain([first, first], it0), it1)
(This isn't the simplest way to write it, but I think this may be the way that's most readable to people who aren't familiar with itertools.)
>>> a = [1, 6, 9, 100, 102, 105, 109, 134, 139]
>>> list(pairify(a))
[(1, 1), (1, 6), (6, 9), (9, 100), (100, 102), (102, 105), (105, 109), (109, 134), (134, 139)]
Then, with a slightly more complicated version of Ned Batchelder's key, you can just use groupby.
However, I think in this case this will end up being more complicated than an explicit generator that does the same thing.
def cluster(sequence, maxgap):
    batch = []
    for prev, val in pairify(sequence):
        if val - prev > maxgap:
            # gap too large: emit the finished batch and start a new one
            yield batch
            batch = []
        batch.append(val)
    if batch:
        yield batch

How to convert columns of a two dimensional array into row in python?

I have a two-dimensional python array that looks like this:
A = [[186,192,133],[12],[122,193,154],[166,188,199],[133,44,23,56,78,96,100]]
Now how do I make a new array that looks like this?
B = [[186,12,122,166,133],[192, 193,188,44],[133,154,199,23],[56],[78],[96],[100]]
I basically want to convert the column of A into rows of B.
Solution
from itertools import izip_longest # in Python 3 zip_longest
list([x for x in y if x is not None] for y in izip_longest(*A))
result:
[[186, 12, 122, 166, 133],
[192, 193, 188, 44],
[133, 154, 199, 23],
[56],
[78],
[96],
[100]]
Explanation
izip_longest gives you an iterator:
>>> from itertools import izip_longest
>>> izip_longest([1, 2, 3], [4, 5])
<itertools.izip_longest at 0x103331890>
Convert it into a list to see what it does:
>>> list(izip_longest([1, 2, 3], [4, 5]))
[(1, 4), (2, 5), (3, None)]
It takes one element from each list and puts them pairwise into a tuple. Furthermore, it fills missing values with None (or another value you supply).
The * unpacks a list so that each of its items is passed to the function as a separate argument. For example, we can put our two lists inside another list and use * and it still works the same:
>>> list(izip_longest(*[[1, 2, 3], [4, 5]]))
[(1, 4), (2, 5), (3, None)]
This is not limited to two arguments. An example with three.
Single arguments:
>>> list(izip_longest([1, 2, 3], [4, 5], [6]))
[(1, 4, 6), (2, 5, None), (3, None, None)]
All arguments in one list with *:
>>> list(izip_longest(*[[1, 2, 3], [4, 5], [6]]))
[(1, 4, 6), (2, 5, None), (3, None, None)]
You don't want the None values. Filter them out with a list comprehension:
>>> [x for x in y if x is not None]
For your A, you get this:
>>> list(izip_longest(*A))
[(186, 12, 122, 166, 133),
(192, None, 193, 188, 44),
(133, None, 154, 199, 23),
(None, None, None, None, 56),
(None, None, None, None, 78),
(None, None, None, None, 96),
(None, None, None, None, 100)]
Now, y runs through all entries in this list, such as (186, 12, 122, 166, 133), while x runs through each individual number in y, such as 186. The outer [] creates a list, so instead of the tuple (186, 12, 122, 166, 133) we get a list [186, 12, 122, 166, 133]. Finally, the if x is not None filters out the None values.
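On Python 3 the same idea can be sketched with `itertools.zip_longest`. The `_MISSING` sentinel is an assumption added here (it is not part of the answer above) so that genuine `None` entries in the data, if there were any, would survive the filter:

```python
from itertools import zip_longest

A = [[186, 192, 133], [12], [122, 193, 154], [166, 188, 199],
     [133, 44, 23, 56, 78, 96, 100]]

_MISSING = object()  # unique fill value that cannot collide with real data

# Transpose, padding short rows with the sentinel, then drop the padding.
B = [[v for v in col if v is not _MISSING]
     for col in zip_longest(*A, fillvalue=_MISSING)]

print(B)
# [[186, 12, 122, 166, 133], [192, 193, 188, 44], [133, 154, 199, 23],
#  [56], [78], [96], [100]]
```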
Another method, transposing with map and filter (this relies on Python 2, where map(None, ...) pads the shorter lists with None):
A = [[186, 192, 133], [12], [122, 193, 154], [166, 188, 199], [133, 44, 23, 56, 78, 96, 100]]
print([list(filter(None,sub)) for sub in map(None,*A)])
[[186, 12, 122, 166, 133], [192, 193, 188, 44], [133, 154, 199, 23], [56], [78], [96], [100]]
If 0 is a potential value, you will need to check specifically for None:
print([list(filter(lambda x: x is not None, sub)) for sub in map(None,*A)])
Or map with a regular list comp as per Mike's answer:
[[x for x in sub if x is not None] for sub in map(None,*A)]
def rotate(A):
    B = []
    added = True
    while added:
        added = False
        col = []
        for row in A:
            try:
                col.append(row.pop(0))
            except IndexError:
                continue
            added = True
        col and B.append(col)
    return B

How can I remove coordinates of points from lists if they are very close and can be considered as a single point?

I have 2 lists of numbers that represent points coordinates in a cartesian diagram.
My aim is to consider many points in a range of 10 as 1 point.
First example:
first point (1,2)
second point (2,3)
third point (3,4)
4th point (80,90)
Coordinates list:
#(1)
x = [1,2,3, 80]
Y = [2,3,4, 90 ]
I would like to delete points that lie within a range of 10 of each other (both in x and y). Here we could consider the first three points as a single one.
And result is:
x = [1, 80] and y = [2, 90] or
x = [2,80] and y = [3, 90] or
x = [3,80] and y = [4, 90]
If coordinates lists are:
#(2)
x = [1,2,3, 80]
Y = [2,3,70, 90 ]
we could consider the first 2 numbers as one
Result is:
x = [1, 80] and y = [2, 90] or
x = [2,80] and y = [3, 90]
If they are:
#(3)
x = [1,2, 75, 78 , 80, 101]
Y = [2,3, 81, 86, 90 , 91]
Result:
x = [1,75, 101] and y = [2,81, 91] or
x = [1,78, 101] and y = [2,86, 91] or
x = [1,80, 101] and y = [2,90, 91] or
x = [2,75, 101] and y = [3,81, 91] or
x = [2,78, 101] and y = [3,86, 91] or
x = [2,80, 101] and y = [3,90, 91]
I need only 1 of these 6 solutions. It's not important if I have x = [1,75] or x = [1,78].
The important thing is that close numbers are merged into a single one.
Last example:
x = [ 95, 154, 161, 135, 138, 116]
y = [158, 166, 168, 170, 170, 171]
In this case, only 3 points remain.
171 - 170 = 1 => 138 - 116 = 22: both results are within the range of 25, so I delete 116 and 171
170 - 170 = 0 => 138 - 135 = 3: both results are within the range of 25, so I delete 170 and 138
170 - 168 = 2 => 135 - 161 = 26: I cannot delete
168 - 166 = 2 => 161 - 154 = 7: I delete 168 and 161
166 - 158 = 8 => 154 - 95 = 59: I cannot delete
x = [95, 154, 161, 135]
Y = [158, 166, 168, 170]
I repeat the operation and I delete 161 in x and 168 in y because:
168 - 166 = 2 => 161 - 154 = 7
x = [95, 154, 135]
Y = [158, 166, 170]
y list is in ascending order.
What is the fastest way to compare them?
In general, it's easier to filter lists than to delete from them in-place.
But to filter easily, you need a single list, not two of them.
That's exactly what zip is for.
I'm not sure I understand exactly what you're asking for, because from the description it sounds like everything should stay except for 161 / 168. I'll show the rule it sounds like you were describing.
xy = zip(x, y)
new_xy = ((a, b) for a, b in xy if abs(a-b) <= 10)
x, y = zip(*new_xy)
Whatever your actual goal is, just replace the if abs(a-b) <= 10 with the right rule for "if this pair of values should be kept", and you're done.
To understand how this works, you should try printing out xy (or, if you're using Python 3.x, list(xy)), and the other intermediate bits.
(If you're using Python 2.x, and your lists are very big, you should probably import itertools and then use xy = itertools.izip(x, y) to avoid creating an extra list for no good reason. If you're using Python 3.x, this isn't a problem, because zip doesn't create extra lists anymore.)
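To make the intermediate values concrete, here is a small sketch using the first example from the question. The keep rule (`a > 2`) is only a stand-in chosen for illustration, not the rule the question asks for:

```python
x = [1, 2, 3, 80]
y = [2, 3, 4, 90]

xy = list(zip(x, y))                      # pair up the coordinates
kept = [(a, b) for a, b in xy if a > 2]   # stand-in keep rule
new_x, new_y = zip(*kept)                 # "unzip" back into two tuples

print(xy)            # [(1, 2), (2, 3), (3, 4), (80, 90)]
print(new_x, new_y)  # (3, 80) (4, 90)
```

Note that the unzip step returns tuples, and that it raises a ValueError if the filter keeps nothing, because there is then nothing to unpack.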
From further comments, it seems like maybe you want to check x[i] against x[i-1], not against y[i], and in fact you don't look at the y values at all—it's just that if x[i] goes, so does y[i].
To simplify things, let's take y out of the equation entirely, and just deal with filtering x. We can come back to the y later.
There are two ways to approach this. The first is to break down and build an explicit loop, where we keep track of a last_value each time through the loop. The second is to get a list of adjacent pairs within x.
The answer is, once again, zip:
x_pairs = zip(x, x[1:])
The only problem is that this doesn't give you anything to compare the first element to. For example:
>>> x = [95, 154, 161, 135, 138, 116]
>>> list(zip(x, x[1:]))
[(95, 154), (154, 161), (161, 135), (135, 138), (138, 116)]
That tells you whether or not to keep 154, 161, 135, 138, and 116… but what about that 95? Well, you never explained the rule for that. If you want to compare it to 0, do zip([0]+x, x). If you want to always keep it… you can just compare it to itself, so zip(x[:1]+x, x). And so on. Whatever rule you want is pretty easy to write. I'll go with the "compare to 0" rule.
So, now we've got these pairs of adjacent values: (0, 95), then (95, 154), and so on. In each case, if the distance between two adjacent values is <= 10, we want to keep the latter value. That's easy:
x_pairs = zip([0]+x, x)
new_x = [pair[1] for pair in x_pairs if abs(pair[0]-pair[1]) <= 10]
Putting the y in there is the same trick we used originally: just zip it to the pairs, and zip it back out afterward. To make things a bit simpler, instead of zipping the pairs and then zipping y on, we'll just zip it all up at once:
x_pairs_y = zip([0]+x, x, y)
new_xy = (xxy[1:] for xxy in x_pairs_y if abs(xxy[0]-xxy[1]) <= 10)
new_x, new_y = zip(*new_xy)
In some of your explanation, it sounds like you want to compare adjacent x values, and also compare adjacent y values, and filter them out if either difference is >10.
If so, that's just more of the same:
xy_pairs = zip([0]+x, [0]+y, x, y)
new_xy = (xyxy[2:] for xyxy in xy_pairs
          if abs(xyxy[0]-xyxy[2]) <= 10 and abs(xyxy[1]-xyxy[3]) <= 10)
new_x, new_y = zip(*new_xy)
However, when things get so complicated that your simple one-liners don't fit on one line, you should consider factoring things out a bit. For example, instead of having a list of x values and a list of y values, why not create a Point class?
class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y
    def long_diff(self, other):
        return max(abs(self.x-other.x), abs(self.y-other.y))
# build a list (not a generator) so it can be concatenated and iterated twice
points = [Point(px, py) for px, py in zip(x, y)]
point_pairs = zip([Point(0, 0)]+points, points)
new_points = (pair[1] for pair in point_pairs if pair[0].long_diff(pair[1]) <= 10)
new_x, new_y = zip(*((point.x, point.y) for point in new_points))
It's a bit longer, but a lot easier to understand.
And it would be even easier to understand if you just used a list of Point objects in the first place, instead of separate lists of x and y values.
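A minimal sketch of what "a list of Point objects in the first place" could look like, assuming `typing.NamedTuple` as the container (a substitution made here for brevity, not the class from the answer) and the first example's points:

```python
from typing import NamedTuple

class Point(NamedTuple):
    x: int
    y: int

    def long_diff(self, other):
        """Largest coordinate-wise distance to another point."""
        return max(abs(self.x - other.x), abs(self.y - other.y))

points = [Point(1, 2), Point(2, 3), Point(3, 4), Point(80, 90)]

# Compare each point with its predecessor; the first is paired with Point(0, 0).
pairs = zip([Point(0, 0)] + points, points)
close = [cur for prev, cur in pairs if prev.long_diff(cur) <= 10]

print(close)  # [Point(x=1, y=2), Point(x=2, y=3), Point(x=3, y=4)]
```

Because the points stay together as single objects, there is no need to zip the coordinates apart again unless separate x and y lists are really required.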
From further comments, it sounds like the condition you want is the exact opposite of the condition shown in my last two examples, and you don't know how to negate a condition.
First, not takes any expression and returns the opposite truth value. So:
new_xy = (xyxy[2:] for xyxy in xy_pairs
          if not(abs(xyxy[0]-xyxy[2]) <= 10 and abs(xyxy[1]-xyxy[3]) <= 10))
Alternatively, "delete if the x distance is <= 10 and the y distance is <= 10" is the same as "keep if the x distance is > 10 or the y distance is > 10", right? So:
new_xy = (xyxy[2:] for xyxy in xy_pairs
          if abs(xyxy[0]-xyxy[2]) > 10 or abs(xyxy[1]-xyxy[3]) > 10)
Also, if you're sure that your sequences are always monotonically increasing (that is, any element is always larger than the one before it), you don't really need the abs here (as long as you make sure to get the operand order right).
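Putting the negated rule to work on the first example from the question might look like the sketch below. It assumes the first point is always kept (rather than compared against 0), and `far_from_previous` is a hypothetical helper name. It makes a single pass and compares each point with its original neighbour, not with the last point that was kept:

```python
x = [1, 2, 3, 80]
y = [2, 3, 4, 90]

def far_from_previous(px, py, cx, cy, limit=10):
    """Keep rule: the point differs from its predecessor by more than limit in x or y."""
    return abs(px - cx) > limit or abs(py - cy) > limit

# Always keep the first point, then test every later point against its neighbour.
kept = [(x[0], y[0])] + [
    (cx, cy)
    for px, py, cx, cy in zip(x, y, x[1:], y[1:])
    if far_from_previous(px, py, cx, cy)
]
new_x, new_y = zip(*kept)

print(new_x, new_y)  # (1, 80) (2, 90)
```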
I was not sure why you "repeat the operation", or when to stop, so I have coded it up so the operation repeats until no more points are removed. Intermediate points are shown in order in a list as a two-element tuple of (x, y) coordinates rather than separate lists of x and y.
def outrange(pts, rnge):
    partial = pts[::-1] # current points in reverse order
    i, lastp = 0, []
    while lastp != partial:
        print('%i times around we have: %r' % (i, partial[::-1]))
        i, lastp, partial = (i+1,
                             partial,
                             [(pn1x, pn1y)
                              for (pnx, pny), (pn1x, pn1y) in zip(partial[1:], partial)
                              if abs(pn1x - pnx) > rnge or abs(pn1y - pny) > rnge
                              ] + partial[-1:])
    return partial[::-1]
if __name__ == '__main__':
    j = 0
    for rnge, x, y in [(10, [1, 2, 3, 80] , [2, 3, 4, 90 ]),
                       (10, [1, 2, 3, 80] , [2, 3, 70, 90 ]),
                       (10, [1,2, 75, 78 , 80, 101], [2,3, 81, 86, 90 , 91]),
                       (25, [ 95, 154, 161, 135, 138, 116], [158, 166, 168, 170, 170, 171])]:
        j += 1
        print('\n## Example %i: Points outside range %s of:' % (j, rnge))
        print(' x = %r\n y = %r' % (x, y))
        pts = [(xx, yy) for xx, yy in zip(x,y)]
        ans_x, ans_y = [list(z) for z in zip(*outrange(pts, rnge))]
        print(' Answer: x = %r\n y = %r' % (ans_x, ans_y))
The output is:
## Example 1: Points outside range 10 of:
x = [1, 2, 3, 80]
y = [2, 3, 4, 90]
0 times around we have: [(1, 2), (2, 3), (3, 4), (80, 90)]
1 times around we have: [(1, 2), (80, 90)]
Answer: x = [1, 80]
y = [2, 90]
## Example 2: Points outside range 10 of:
x = [1, 2, 3, 80]
y = [2, 3, 70, 90]
0 times around we have: [(1, 2), (2, 3), (3, 70), (80, 90)]
1 times around we have: [(1, 2), (3, 70), (80, 90)]
Answer: x = [1, 3, 80]
y = [2, 70, 90]
## Example 3: Points outside range 10 of:
x = [1, 2, 75, 78, 80, 101]
y = [2, 3, 81, 86, 90, 91]
0 times around we have: [(1, 2), (2, 3), (75, 81), (78, 86), (80, 90), (101, 91)]
1 times around we have: [(1, 2), (75, 81), (101, 91)]
Answer: x = [1, 75, 101]
y = [2, 81, 91]
## Example 4: Points outside range 25 of:
x = [95, 154, 161, 135, 138, 116]
y = [158, 166, 168, 170, 170, 171]
0 times around we have: [(95, 158), (154, 166), (161, 168), (135, 170), (138, 170), (116, 171)]
1 times around we have: [(95, 158), (154, 166), (135, 170)]
2 times around we have: [(95, 158), (154, 166)]
Answer: x = [95, 154]
y = [158, 166]
The algorithm does what you describe in your explanation of example 4, although, as I noted in a comment on the question, your logic is unclear: you say you drop a point and then still show it in the intermediate step.
If you calculated results that differ from mine, please state what causes the difference, i.e. what your algorithm does that mine does not, or check your answer carefully for mistakes.
