I have three lists (really columns in a pandas DataFrame): one with the data of interest, one with x array coordinates, and one with y array coordinates. All lists are the same length and are ordered consistently, so each data value lines up with its coordinates (for example L1: "Apple" goes with L2: "1" and L3: "A"). I would like to build an array whose dimensions come from the two coordinate lists and whose values come from the data list. What is the best way to do this?
The expected output would be in the form of a numpy array or something like:
array = [[0,0,0,3,0,0,2,3], [0,0,0,0,0,0,0,3]]  # based on the data below
Where in this example the array has the dimensions of y = 2 from y.unique() and x = 8 from x.unique().
The following is example input data for what I am talking about:
array_x  array_y  Data
1        a        0
2        a        0
3        a        0
4        a        3
5        a        0
6        a        0
7        a        2
8        a        3
1        b        0
2        b        0
3        b        0
4        b        0
5        b        0
6        b        0
7        b        0
8        b        3
You may be looking for pivot:
out = df.pivot(values=['Data'], columns=['array_y'], index=['array_x']).to_numpy()
Output:
array([[0, 0],
       [0, 0],
       [0, 0],
       [3, 0],
       [0, 0],
       [0, 0],
       [2, 0],
       [3, 3]], dtype=int64)
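The output above has one row per array_x value. For the orientation asked for in the question (one row per array_y value), a minimal sketch, assuming the column names from the example data, swaps index and columns:

```python
import pandas as pd

# Example data from the question: x runs 1..8 for each of the rows 'a' and 'b'.
df = pd.DataFrame({
    "array_x": list(range(1, 9)) * 2,
    "array_y": ["a"] * 8 + ["b"] * 8,
    "Data": [0, 0, 0, 3, 0, 0, 2, 3, 0, 0, 0, 0, 0, 0, 0, 3],
})

# One row per array_y value, one column per array_x value.
grid = df.pivot(index="array_y", columns="array_x", values="Data").to_numpy()
print(grid)
```

Note that pivot raises an error if the same (array_x, array_y) pair occurs more than once; pivot_table with an aggregation function would be needed in that case.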
Supposing you have a dataframe like that:
import pandas as pd
import numpy as np
myDataframe = pd.DataFrame([[1,2],[3,4],[5,6]], columns=['x','y'])
Then you can select the columns you want and create an array from them:
my_array = np.array(myDataframe[['x','y']])
>>> my_array
array([[1, 2],
       [3, 4],
       [5, 6]], dtype=int64)
You could do a zip (note: I'm shorthand-ing some of your example data):
data_x = [1, 2, 3, 4, 5, 6, 7, 8] * 2
data_y = ['a'] * 8 + ['b'] * 8
data_vals = [0,0,0,3,0,0,2,3,0,0,0,0,0,0,0,3]
coll = dict()
for (x, y, val) in zip(data_x, data_y, data_vals):
    if coll.get(y) is None:
        coll[y] = []
    # grow the row with zeros until position x exists
    if x > len(coll[y]):
        coll[y].extend([0] * (x - len(coll[y])))
    coll[y][x - 1] = val
result = []
for k in sorted(coll):
    result.append(coll[k])
print(coll)
print(result)
Output:
{'a': [0, 0, 0, 3, 0, 0, 2, 3], 'b': [0, 0, 0, 0, 0, 0, 0, 3]}
[[0, 0, 0, 3, 0, 0, 2, 3], [0, 0, 0, 0, 0, 0, 0, 3]]
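If a NumPy array is wanted directly, the same idea can be sketched with np.unique, which also handles coordinates that are labels rather than 1-based integers. This is only a sketch under the assumption that each (x, y) pair appears at most once; the variable names follow the snippet above and grid is a name introduced here.

```python
import numpy as np

data_x = [1, 2, 3, 4, 5, 6, 7, 8] * 2
data_y = ['a'] * 8 + ['b'] * 8
data_vals = [0, 0, 0, 3, 0, 0, 2, 3, 0, 0, 0, 0, 0, 0, 0, 3]

# Map each coordinate label to its position among the sorted unique labels.
x_labels, x_idx = np.unique(data_x, return_inverse=True)
y_labels, y_idx = np.unique(data_y, return_inverse=True)

# Start from zeros so that missing (x, y) pairs stay 0, then scatter the values.
grid = np.zeros((len(y_labels), len(x_labels)), dtype=int)
grid[y_idx, x_idx] = data_vals
print(grid)
```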
Replace the value of 0 in matrix A with the value of the same position in matrix B, and the value of non-zero in A remains unchanged.
Pandas/numpy approaches are all acceptable.
A:
0 0 1
0 0 0
1 0 0
B:
0 2 4
2 0 3
4 3 0
The ideal result is:
C:
0 2 1
2 0 3
1 3 0
I need a concise way to handle a similar large matrix.
Assuming numpy
You can use numpy.where:
import numpy as np
A = np.array([[0, 0, 1],
[0, 0, 0],
[1, 0, 0]])
B = np.array([[0, 2, 4],
[2, 0, 3],
[4, 3, 0]])
C = np.where(A==1, A, B)
# OR
# C = np.where(A==0, B, A)
output:
array([[0, 2, 1],
[2, 0, 3],
[1, 3, 0]])
NB. I used A==1 to be explicit, but implicit 1/True equality makes it possible to do np.where(A, A, B)
Assuming pandas:
The approach is similar using where
import pandas as pd
dfA = pd.DataFrame(A)
dfB = pd.DataFrame(B)
dfC = dfA.where(dfA.ne(0), dfB)
# OR
# dfC = dfA.mask(dfA.eq(0), dfB)
output:
0 1 2
0 0 2 1
1 2 0 3
2 1 3 0
One possible solution could be:
import numpy as np
a = np.array([[0, 0, 1], [0, 0, 0], [1, 0, 0]])
b = np.array([[0, 2, 4], [2, 0, 3], [4, 3, 0]])
c = np.where(a == 0, b, a)
print(c)
Output:
[[0 2 1]
[2 0 3]
[1 3 0]]
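If you have 2D NumPy arrays, use boolean-mask assignment. Index B with the same mask; assigning the whole of B to the masked positions raises a shape-mismatch error:
mask = A == 0
A[mask] = B[mask]
A now holds the desired output.
If you want a new matrix C:
C = A.copy()
mask = C == 0
C[mask] = B[mask]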
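For reference, a self-contained sketch of the mask-based variant using the matrices from the question. The right-hand side is indexed with the same mask so that both sides select the same number of elements.

```python
import numpy as np

A = np.array([[0, 0, 1], [0, 0, 0], [1, 0, 0]])
B = np.array([[0, 2, 4], [2, 0, 3], [4, 3, 0]])

C = A.copy()        # leave A untouched
mask = C == 0       # True where C still holds a zero
C[mask] = B[mask]   # both sides select the same positions
print(C)
```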
I'm trying to code the 2048 game, and I'm stuck at the move up/down part.
For example, if I have a list like this:
2 0 2 4
0 8 2 4
4 4 2 0
0 4 0 2
I want to move my numbers up, so I've something like this:
2 8 2 4
4 4 2 4
0 4 2 2
0 0 0 0
I don't even know how to begin. Can someone give me some tips?
I tried this, but it only works for an identity matrix:
new_matrix = []
for line in matrix:
    for pos, element in enumerate(line):
        if element != 0:
            new_matrix.append(element)
            line[pos] = 0
list.clear(matrix[0])
matrix[0] = new_matrix
return (matrix)
You can iterate through the rows and, wherever row i holds a 0 at index n, copy the value at index n of the next row into it and set that position in the next row to 0:
A = [[1,0,2,1],[0,1,3,4],[4,5,1,0],[0,3,0,1]]
length = len(A)
for i, elem in enumerate(A):
    for j, item in enumerate(elem):
        if item == 0 and i + 1 < length:
            A[i][j] = A[i+1][j]
            A[i+1][j] = 0
print(A)
Prints:
[[1, 1, 2, 1], [4, 5, 3, 4], [0, 3, 1, 1], [0, 0, 0, 0]]
For the current example:
>>> A
[[2, 0, 2, 4],
[0, 8, 2, 4],
[4, 4, 2, 0],
[0, 4, 0, 2]]
>>> for i, elem in enumerate(A):
...     for j, item in enumerate(elem):
...         if item == 0 and i + 1 < length:
...             A[i][j] = A[i+1][j]
...             A[i+1][j] = 0
...
>>> A
[[2, 8, 2, 4],
[4, 4, 2, 4],
[0, 4, 2, 2],
[0, 0, 0, 0]]
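Note that the loop above lifts each value by at most one row, so a column such as 0, 0, 0, 2 is not fully packed after a single pass. A minimal sketch that packs every column completely is shown below; move_up is a hypothetical helper name, and merging equal tiles is deliberately left out because the question only asks about moving.

```python
def move_up(matrix):
    """Return a new grid with the non-zero values of each column packed to the top."""
    n_rows = len(matrix)
    packed_columns = []
    for column in zip(*matrix):                    # iterate over the columns
        tiles = [v for v in column if v != 0]      # keep the tiles in their order
        packed_columns.append(tiles + [0] * (n_rows - len(tiles)))
    # Transpose back so the result is a list of rows again.
    return [list(row) for row in zip(*packed_columns)]


grid = [[2, 0, 2, 4],
        [0, 8, 2, 4],
        [4, 4, 2, 0],
        [0, 4, 0, 2]]
print(move_up(grid))
```

Moving down works the same way with the padding placed in front of the tiles instead of after them.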
I am trying to count the number of neighbours for each element in a 2d numpy array that differ from the element itself (4-neighbourhood in this case, but 8-neighbourhood is also interesting).
Something like this:
input labels:
[[1 1 1 2 2 2 2]
[1 1 1 2 2 2 2]
[1 1 1 2 2 2 2]
[1 1 3 3 3 5 5]
[4 4 4 3 3 5 5]
[4 4 4 3 3 5 5]] (6, 7)
count of unique neighbour labels:
[[0 0 1 1 0 0 0]
[0 0 1 1 0 0 0]
[0 0 2 2 1 1 1]
[1 2 2 1 2 2 1]
[1 1 1 1 1 1 0]
[0 0 1 1 1 1 0]] (6, 7)
I have the code below, and out of curiosity I am wondering if there is a better way to achieve this, perhaps without the for loops?
import numpy as np
import cv2
labels_image = np.array([
[1,1,1,2,2,2,2],
[1,1,1,2,2,2,2],
[1,1,1,2,2,2,2],
[1,1,3,3,3,5,5],
[4,4,4,3,3,5,5],
[4,4,4,3,3,5,5]])
print('input labels:\n', labels_image, labels_image.shape)
# Make a border, otherwise neighbours are counted as wrapped values from the other side
labels_image = cv2.copyMakeBorder(labels_image, 1, 1, 1, 1, cv2.BORDER_REPLICATE)
offsets = [(-1, 0), (0, -1), (0, 1), (1, 0)] # 4 neighbourhood
# Stack labels_image with one shifted per offset so we get a 3d array
# where each z-value corresponds to one of the neighbours
# pass a list: recent NumPy versions reject a generator here
stacked = np.dstack([np.roll(np.roll(labels_image, i, axis=0), j, axis=1) for i, j in offsets])
# count number of unique neighbours, also take the border away again
labels_image = np.array([[(len(np.unique(stacked[i,j])) - 1)
                          for j in range(1, labels_image.shape[1] - 1)]
                         for i in range(1, labels_image.shape[0] - 1)])
print('count of unique neighbour labels:\n', labels_image, labels_image.shape)
I tried using np.unique with the return_counts and axis arguments, but could not get it to work.
Here's one approach -
import itertools
import numpy as np

def count_nunique_neighbors(ar):
    a = np.pad(ar, (1,1), mode='reflect')
    c = a[1:-1,1:-1]

    # shifted views of the padded array, one per neighbour
    top = a[:-2,1:-1]
    bottom = a[2:,1:-1]
    left = a[1:-1,:-2]
    right = a[1:-1,2:]

    # neighbours that differ from the centre
    ineq = [top!= c,bottom!= c, left!= c, right!= c]
    count = ineq[0].astype(int) + ineq[1] + ineq[2] + ineq[3]

    # discount neighbours that carry the same label as another neighbour
    blck = [top, bottom, left, right]
    for i,j in list(itertools.combinations(range(4), r=2)):
        count -= ((blck[i] == blck[j]) & ineq[j])
    return count
Sample run -
In [22]: a
Out[22]:
array([[1, 1, 1, 2, 2, 2, 2],
[1, 1, 1, 2, 2, 2, 2],
[1, 1, 1, 2, 2, 2, 2],
[1, 1, 3, 3, 3, 5, 5],
[4, 4, 4, 3, 3, 5, 5],
[4, 4, 4, 3, 3, 5, 5]])
In [23]: count_nunique_neighbors(a)
Out[23]:
array([[0, 0, 1, 1, 0, 0, 0],
[0, 0, 1, 1, 0, 0, 0],
[0, 0, 2, 2, 1, 1, 1],
[1, 2, 2, 1, 2, 2, 1],
[1, 1, 1, 1, 1, 1, 0],
[0, 0, 1, 1, 1, 1, 0]])
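As far as I can tell, the pairwise correction above subtracts once for every equal pair, so a cell whose top, bottom and left neighbours share one label different from its own ends up with 0 instead of 1. A sort-based sketch that avoids this is shown below; the name count_distinct_neighbours and the mode='edge' padding are my own choices, not part of the answer above.

```python
import numpy as np

def count_distinct_neighbours(labels):
    """Count, per cell, the distinct 4-neighbour labels that differ from the cell."""
    padded = np.pad(labels, 1, mode="edge")   # replicate the border values
    centre = padded[1:-1, 1:-1]
    stacked = np.stack([
        centre,
        padded[:-2, 1:-1],   # top
        padded[2:, 1:-1],    # bottom
        padded[1:-1, :-2],   # left
        padded[1:-1, 2:],    # right
    ])
    # After sorting, every change between consecutive values marks a new label;
    # the centre's own label is the one distinct value that is not counted.
    ordered = np.sort(stacked, axis=0)
    return (np.diff(ordered, axis=0) != 0).sum(axis=0)


labels_image = np.array([
    [1, 1, 1, 2, 2, 2, 2],
    [1, 1, 1, 2, 2, 2, 2],
    [1, 1, 1, 2, 2, 2, 2],
    [1, 1, 3, 3, 3, 5, 5],
    [4, 4, 4, 3, 3, 5, 5],
    [4, 4, 4, 3, 3, 5, 5]])
print(count_distinct_neighbours(labels_image))
```

For the 8-neighbourhood, the four diagonal slices of the padded array can be added to the stack in the same way.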
In the following I will give two examples that have different dimension values.
Lock-1
# numbers are the values shown on the lock, so in this case: 0,1,2
numbers = 5
# fields are those things i can turn to change my combination
fields = 4
So what I would expect for all of my possibilities is
0 0 0 5
0 0 1 4
0 0 2 3
0 0 3 2
0 0 4 1
0 0 5 0
0 1 0 4
0 1 1 3
0 1 2 2
0 1 3 1
0 1 4 0
0 2 0 3
0 2 1 2
0 2 2 1
0 2 3 0
0 3 0 2
0 3 1 1
0 3 2 0
0 4 0 1
0 4 1 0
0 5 0 0
1 0 0 4
1 0 1 3
1 0 2 2
1 0 3 1
1 0 4 0
1 1 0 3
1 1 1 2
1 1 2 1
1 1 3 0
1 2 0 2
1 2 1 1
1 2 2 0
1 3 0 1
1 3 1 0
1 4 0 0
2 0 0 3
2 0 1 2
2 0 2 1
2 0 3 0
2 1 0 2
2 1 1 1
2 1 2 0
2 2 0 1
2 2 1 0
2 3 0 0
3 0 0 2
3 0 1 1
3 0 2 0
3 1 0 1
3 1 1 0
3 2 0 0
4 0 0 1
4 0 1 0
4 1 0 0
5 0 0 0
My second lock has the following values:
numbers = 3
values = 3
So what I would expect as my possibilities would look like this
0 0 3
0 1 2
0 2 1
0 3 0
1 0 2
1 1 1
1 2 0
2 0 1
2 1 0
3 0 0
I know this can be done with itertools.permutations and so on, but I want to generate the rows by building them one at a time rather than overloading my RAM. I figured out that the last two columns always build up the same way.
So I wrote a function which builds them for me:
def posibilities(value):
    all_pos = []
    for y in range(value + 1):
        posibility = []
        posibility.append(y)
        posibility.append(value)
        all_pos.append(posibility)
        value -= 1
    return all_pos
Now I want some kind of way to fit the other values dynamically around my function, so e.g. Lock - 2 would now look like this:
0 posibilities(3)
1 posibilities(2)
2 posibilities(1)
3 posibilities(0)
I know I should use a while loop or something similar, but I can't work out a solution for dynamic values.
You could do this recursively, but it's generally best to avoid recursion in Python unless you really need it, e.g. when processing recursive data structures (like trees). Recursion in standard Python (aka CPython) is not very efficient because it cannot do tail call elimination. It also applies a recursion limit (1000 levels by default, but that can be modified by the user).
The sequences that you want to generate are known as weak compositions, and the Wikipedia article gives a simple algorithm which is easy to implement with the help of the standard itertools.combinations function.
#!/usr/bin/env python3
''' Generate the compositions of num of a given width
Algorithm from
https://en.wikipedia.org/wiki/Composition_%28combinatorics%29#Number_of_compositions
Written by PM 2Ring 2016.11.11
'''
from itertools import combinations
def compositions(num, width):
    m = num + width - 1
    last = (m,)
    first = (-1,)
    for t in combinations(range(m), width - 1):
        yield [v - u - 1 for u, v in zip(first + t, t + last)]
# test
for t in compositions(5, 4):
    print(t)
print('- ' * 20)
for t in compositions(3, 3):
    print(t)
output
[0, 0, 0, 5]
[0, 0, 1, 4]
[0, 0, 2, 3]
[0, 0, 3, 2]
[0, 0, 4, 1]
[0, 0, 5, 0]
[0, 1, 0, 4]
[0, 1, 1, 3]
[0, 1, 2, 2]
[0, 1, 3, 1]
[0, 1, 4, 0]
[0, 2, 0, 3]
[0, 2, 1, 2]
[0, 2, 2, 1]
[0, 2, 3, 0]
[0, 3, 0, 2]
[0, 3, 1, 1]
[0, 3, 2, 0]
[0, 4, 0, 1]
[0, 4, 1, 0]
[0, 5, 0, 0]
[1, 0, 0, 4]
[1, 0, 1, 3]
[1, 0, 2, 2]
[1, 0, 3, 1]
[1, 0, 4, 0]
[1, 1, 0, 3]
[1, 1, 1, 2]
[1, 1, 2, 1]
[1, 1, 3, 0]
[1, 2, 0, 2]
[1, 2, 1, 1]
[1, 2, 2, 0]
[1, 3, 0, 1]
[1, 3, 1, 0]
[1, 4, 0, 0]
[2, 0, 0, 3]
[2, 0, 1, 2]
[2, 0, 2, 1]
[2, 0, 3, 0]
[2, 1, 0, 2]
[2, 1, 1, 1]
[2, 1, 2, 0]
[2, 2, 0, 1]
[2, 2, 1, 0]
[2, 3, 0, 0]
[3, 0, 0, 2]
[3, 0, 1, 1]
[3, 0, 2, 0]
[3, 1, 0, 1]
[3, 1, 1, 0]
[3, 2, 0, 0]
[4, 0, 0, 1]
[4, 0, 1, 0]
[4, 1, 0, 0]
[5, 0, 0, 0]
- - - - - - - - - - - - - - - - - - - -
[0, 0, 3]
[0, 1, 2]
[0, 2, 1]
[0, 3, 0]
[1, 0, 2]
[1, 1, 1]
[1, 2, 0]
[2, 0, 1]
[2, 1, 0]
[3, 0, 0]
FWIW, the above code can generate the 170544 sequences of compositions(15, 8) in around 1.6 seconds on my old 2GHz 32bit machine, running on Python 3.6 or Python 2.6. (The timing information was obtained by using the Bash time command).
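The count quoted above can be cross-checked against the closed form: there are C(num + width - 1, width - 1) weak compositions of num into width parts. A small sketch, assuming Python 3.8+ for math.comb and repeating the generator so that the snippet is self-contained:

```python
from itertools import combinations
from math import comb

def compositions(num, width):
    # Same generator as above: choose width - 1 "bar" positions among
    # num + width - 1 slots and read off the gaps between them.
    m = num + width - 1
    last = (m,)
    first = (-1,)
    for t in combinations(range(m), width - 1):
        yield [v - u - 1 for u, v in zip(first + t, t + last)]

# Count lazily so the sequences are never all held in memory at once.
count = sum(1 for _ in compositions(15, 8))
print(count, comb(15 + 8 - 1, 8 - 1))
```

Both numbers should agree, which is a cheap sanity check that no composition is skipped or repeated.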
FWIW, here's a recursive version taken from this answer by user3736966. I've modified it to use the same argument names as my code, to use lists instead of tuples, and to be compatible with Python 3.
def compositions(num, width, parent=[]):
    if width > 1:
        for i in range(num, -1, -1):
            yield from compositions(i, width - 1, parent + [num - i])
    else:
        yield parent + [num]
Somewhat surprisingly, this one is a little faster than the original version, clocking in at around 1.5 seconds for compositions(15, 8).
If your version of Python doesn't understand yield from, you can do this:
def compositions(num, width, parent=[]):
    if width > 1:
        for i in range(num, -1, -1):
            for t in compositions(i, width - 1, parent + [num - i]):
                yield t
    else:
        yield parent + [num]
To generate the compositions in descending order, simply reverse the range call, i.e. for i in range(num + 1):.
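A sketch of that descending variant, written out in full. The name compositions_desc and the tuple default for parent are my own choices here, not part of the original answer.

```python
def compositions_desc(num, width, parent=()):
    """Yield the weak compositions of num into width parts, largest first part first."""
    if width > 1:
        for i in range(num + 1):          # i is what remains for the later parts
            yield from compositions_desc(i, width - 1, parent + (num - i,))
    else:
        yield list(parent + (num,))


for t in compositions_desc(3, 3):
    print(t)
```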
Finally, here's an unreadable one-line version. :)
def c(n, w, p=[]):
    yield from(t for i in range(n,-1,-1)for t in c(i,w-1,p+[n-i]))if w-1 else[p+[n]]
Being an inveterate tinkerer, I couldn't stop myself from making yet another version. :) This is simply the original version combined with the code for combinations listed in the itertools docs. Of course, the real itertools.combinations is written in C so it runs faster than the roughly equivalent Python code shown in the docs.
def compositions(num, width):
    r = width - 1
    indices = list(range(r))
    revrange = range(r-1, -1, -1)
    first = [-1]
    last = [num + r]

    yield [0] * r + [num]
    while True:
        for i in revrange:
            if indices[i] != i + num:
                break
        else:
            return
        indices[i] += 1
        for j in range(i+1, r):
            indices[j] = indices[j-1] + 1
        yield [v - u - 1 for u, v in zip(first + indices, indices + last)]
This version is about 50% slower than the original at doing compositions(15, 8): it takes around 2.3 seconds on my machine.