I know that you can move an array with NumPy, and with np.roll you can shift the array to the right or to the left. I was wondering how to move a specific set of values within the array to the left, right, up, or down.
For example, if I wanted to move what is circled in red to the left, how would I be able to move that and nothing else?
NumPy lets you slice out a subarray and later assign it to a different place:
import numpy as np

x = [
    [0, 1, 2, 1, 0, 0, 1, 2, 1, 0],
    [0, 1, 2, 1, 0, 0, 1, 2, 1, 0],
]

arr = np.array(x)
print(arr)

subarr = arr[0:2, 1:4]   # get the values
print(subarr)

arr[0:2, 0:3] = subarr   # put them in the new place
print(arr)
Result:
[[0 1 2 1 0 0 1 2 1 0]
[0 1 2 1 0 0 1 2 1 0]]
[[1 2 1]
[1 2 1]]
[[1 2 1 1 0 0 1 2 1 0]
[1 2 1 1 0 0 1 2 1 0]]
This keeps the original values at [0][3] and [1][3]. If you want to remove them, you can copy the subarray, set the original place to zero, and put the copy in the new place:
import numpy as np

x = [
    [0, 1, 2, 1, 0, 0, 1, 2, 1, 0],
    [0, 1, 2, 1, 0, 0, 1, 2, 1, 0],
]

arr = np.array(x)
print(arr)

subarr = arr[0:2, 1:4].copy()   # duplicate the values (a plain slice is only a view)
print(subarr)

arr[0:2, 1:4] = 0        # remove the original values
arr[0:2, 0:3] = subarr   # put the copy in the new place
print(arr)
Result:
[[0 1 2 1 0 0 1 2 1 0]
[0 1 2 1 0 0 1 2 1 0]]
[[1 2 1]
[1 2 1]]
[[1 2 1 0 0 0 1 2 1 0]
[1 2 1 0 0 0 1 2 1 0]]
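Since the question mentions np.roll, here is a sketch of the same move using a roll restricted to just the block of interest; the slice bounds are assumptions matching the example above:

import numpy as np

arr = np.array([
    [0, 1, 2, 1, 0, 0, 1, 2, 1, 0],
    [0, 1, 2, 1, 0, 0, 1, 2, 1, 0],
])

# Roll columns 0..3 of the first two rows one step to the left.
# np.roll wraps the leading value around to the end of the block,
# so clear the vacated column afterwards.
arr[0:2, 0:4] = np.roll(arr[0:2, 0:4], -1, axis=1)
arr[0:2, 3] = 0
print(arr)
# [[1 2 1 0 0 0 1 2 1 0]
#  [1 2 1 0 0 0 1 2 1 0]]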
I have a dataframe of N columns. Each element in the dataframe is in the range 0 to N-1.
For example, my dataframe can be something like this (N = 3):
A B C
0 0 2 0
1 1 0 1
2 2 2 0
3 2 0 0
4 0 0 0
I want to create a co-occurrence matrix (please correct me if there is a different standard name for that) of size N x N, in which each element ij contains the number of times that columns i and j take the same value.
A B C
A x 2 3
B 2 x 2
C 3 2 x
Where, for example, matrix[0, 1] means that A and B assume the same value 2 times.
I don't care about the value on the diagonal.
What is the smartest way to do that?
DataFrame.corr
We can define a custom callable for calculating the correlation between the columns of the dataframe. This callable takes two 1-D numpy arrays as its input arguments and returns the count of the number of times the elements in these two arrays equal each other:
df.corr(method=lambda x, y: (x==y).sum())
A B C
A 1.0 2.0 3.0
B 2.0 1.0 2.0
C 3.0 2.0 1.0
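One caveat worth noting: corr pins the diagonal at 1.0 and returns floats. Since the diagonal doesn't matter here, a cast gives clean integer counts off the diagonal; a minimal sketch:

co = df.corr(method=lambda x, y: (x == y).sum()).astype(int)
# off-diagonal entries are the co-occurrence counts;
# the diagonal keeps corr's fixed value of 1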
Let's try broadcasting across the transposition and summing axis 2:
import pandas as pd

df = pd.DataFrame({
    'A': {0: 0, 1: 1, 2: 2, 3: 2, 4: 0},
    'B': {0: 2, 1: 0, 2: 2, 3: 0, 4: 0},
    'C': {0: 0, 1: 1, 2: 0, 3: 0, 4: 0}
})

vals = df.T.values
e = (vals[:, None] == vals).sum(axis=2)   # pairwise equality counts between columns
e:
[[5 2 3]
[2 5 2]
[3 2 5]]
Turn back into a dataframe:
new_df = pd.DataFrame(e, columns=df.columns, index=df.columns)
new_df:
A B C
A 5 2 3
B 2 5 2
C 3 2 5
I don't know about the smartest way, but I think this works:
import numpy as np

m = np.array([[0, 2, 0], [1, 0, 1], [2, 2, 0], [2, 0, 0], [0, 0, 0]])
n = 3
ans = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        # rows where columns i and j agree are exactly the zeros of their difference
        ans[i, j] = len(m) - np.count_nonzero(m[:, i] - m[:, j])
print(ans + ans.T)
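If you want the labelled layout of the other answers, wrapping the result is one extra line; a sketch, assuming the columns are named A, B, C as in the question:

import pandas as pd

labelled = pd.DataFrame((ans + ans.T).astype(int),
                        index=list('ABC'), columns=list('ABC'))
print(labelled)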
You are given two integers n and r, such that 1 <= r < n, and a two-dimensional array W of size n x n. Each element of this array is either 0 or 1.
Your goal is to compute the density map D for array W, using a radius of r.
The output density map is also a two-dimensional array, where each value represents the number of 1s in W within the specified radius.
Given the following input array W of size 5 and radius 1 (n = 5, r = 1):
1 0 0 0 1
1 1 1 0 0
1 0 0 0 0
0 0 0 1 1
0 1 0 0 0
Output (using Python):
3 4 2 2 1
4 5 2 2 1
3 4 3 3 2
2 2 2 2 2
1 1 2 2 2
Logic: the value in the first row, first column of the input is 1, and r is 1. For each cell we check one element to the right, one to the left, one above, the top-left, top-right, the one below, the bottom-left and the bottom-right, and sum all of those elements together with the cell itself (neighbours that fall outside the array are skipped).
No third-party library should be used.
I did it with a for loop and an inner for loop, checking each element. Is there a better approach?
Optimization: for each 1 in W, update the count of every location in whose neighborhood it lies.
Although for W of size n x n the following algorithm still takes O(n^2) steps, if W is sparse, i.e. the number of 1s (say k) is much smaller than n x n, then instead of the r x r x n x n steps of the approach stated in the question, the following takes n x n + r x r x k steps, which is much lower when k << n x n.
Given r and W stored as
[[1, 0, 0, 0, 1],
[1, 1, 1, 0, 0],
[1, 0, 0, 0, 0],
[0, 0, 0, 1, 1],
[0, 1, 0, 0, 0]]
then the following

output = [[0 for j in range(len(W[0]))] for i in range(len(W))]
for i in range(len(W)):
    for j in range(len(W[0])):
        if W[i][j] == 1:
            # bump every in-bounds cell whose neighborhood contains (i, j)
            for off_i in range(-r, r + 1):
                for off_j in range(-r, r + 1):
                    if (0 <= i + off_i < len(W)) and (0 <= j + off_j < len(W[0])):
                        output[i + off_i][j + off_j] += 1

stores the required values in output.
For r = 1, the output is as required:
[[3, 4, 2, 2, 1],
[4, 5, 2, 2, 1],
[3, 4, 3, 3, 2],
[2, 2, 2, 2, 2],
[1, 1, 2, 2, 2]]
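For completeness, here is a sketch of another library-free workaround: a 2-D prefix sum (summed-area table) P, where P[i][j] holds the sum of W[0..i-1][0..j-1]. Each density value then becomes a single O(1) lookup, so the total work is O(n^2) regardless of r. The function name density_map is mine, not from the question.

def density_map(W, r):
    n, m = len(W), len(W[0])
    # P[i][j] = sum of W over rows 0..i-1 and columns 0..j-1
    P = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            P[i + 1][j + 1] = W[i][j] + P[i][j + 1] + P[i + 1][j] - P[i][j]
    out = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            # clip the (2r+1) x (2r+1) window to the array bounds
            top, left = max(0, i - r), max(0, j - r)
            bottom, right = min(n, i + r + 1), min(m, j + r + 1)
            out[i][j] = P[bottom][right] - P[top][right] - P[bottom][left] + P[top][left]
    return out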
I have this code:
import numpy as np

result = {}
result['depth'] = [1, 1, 1, 2, 2, 2]
result['generation'] = [1, 1, 1, 2, 2, 2]
result['dimension'] = [1, 2, 3, 1, 2, 3]
result['data'] = [np.array([0, 0, 0]), np.array([0, 0, 0]), np.array([0, 0, 0]),
                  np.array([0, 0, 0]), np.array([0, 0, 0]), np.array([0, 0, 0])]

for v in np.unique(result['depth']):
    temp_v = (result['depth'] == v)
    values_v = [result[string][temp_v] for string in result.keys()]
    this_v = dict(zip(result.keys(), values_v))
in which I want to create a new dict called this_v, with the same keys as the original dict result, but fewer values.
The line:
values_v = [result[string][temp_v] for string in result.keys()]
gives an error
TypeError: only integer scalar arrays can be converted to a scalar index
which I don't understand, since I can create ex = result[result.keys()[0]][temp_v] just fine. It just does not let me do this with a for loop so that I can fill the list.
Any idea as to why it does not work?
In order to solve your problem (finding and dropping duplicates), I encourage you to use pandas. It is a Python module that makes your life absurdly simple:
import numpy as np

result = {}
result['depth'] = [1, 1, 1, 2, 2, 2]
result['generation'] = [1, 1, 1, 2, 2, 2]
result['dimension'] = [1, 2, 3, 1, 2, 3]
result['data'] = [np.array([0, 0, 0]), np.array([0, 0, 0]), np.array([0, 0, 0]),
                  np.array([0, 0, 0]), np.array([0, 0, 0]), np.array([0, 0, 0])]
# Here comes pandas!
import pandas as pd
# Converting your dictionary of lists into a beautiful dataframe
df = pd.DataFrame(result)
#> data depth dimension generation
# 0 [0, 0, 0] 1 1 1
# 1 [0, 0, 0] 1 2 1
# 2 [0, 0, 0] 1 3 1
# 3 [0, 0, 0] 2 1 2
# 4 [0, 0, 0] 2 2 2
# 5 [0, 0, 0] 2 3 2
# Dropping duplicates... in one single command!
df = df.drop_duplicates('depth')
#> data depth dimension generation
# 0 [0, 0, 0] 1 1 1
# 3 [0, 0, 0] 2 1 2
If you want your data back in the original format... you need, yet again, just one line of code!
df.to_dict('list')
#> {'data': [array([0, 0, 0]), array([0, 0, 0])],
# 'depth': [1, 2],
# 'dimension': [1, 1],
# 'generation': [1, 2]}
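As an aside on the original TypeError: temp_v is a numpy boolean array (comparing a list against the numpy scalar from np.unique broadcasts), and a plain Python list cannot be indexed with such a mask. A minimal sketch of the direct fix is to store numpy arrays in the dict, after which the original comprehension works:

import numpy as np

result = {}
result['depth'] = np.array([1, 1, 1, 2, 2, 2])
result['generation'] = np.array([1, 1, 1, 2, 2, 2])
result['dimension'] = np.array([1, 2, 3, 1, 2, 3])
result['data'] = np.zeros((6, 3), dtype=int)   # the six length-3 vectors stacked as rows

for v in np.unique(result['depth']):
    temp_v = (result['depth'] == v)                  # boolean mask over an ndarray
    this_v = {k: result[k][temp_v] for k in result}  # boolean indexing now works for every value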
For example, the binary table for 3 bits:
0 0 0
0 0 1
0 1 0
1 1 1
1 0 0
1 0 1
And I want to store this into an n*n*2 array so it would be:
0 0 0
0 0 1
0 1 0
1 1 1
1 0 0
1 0 1
For generating the combinations automatically, you can use itertools.product from the standard library, which generates all possible combinations of the supplied sequences, i.e. the Cartesian product of the input iterables. The repeat argument comes in handy, as all of our sequences here are identical ranges.
from itertools import product

x = list(product(range(2), repeat=3))
Now if we want an array instead of a list of tuples, we can just pass this to numpy.array:
import numpy as np
x = np.array(x)
# [[0 0 0]
# [0 0 1]
# [0 1 0]
# [0 1 1]
# [1 0 0]
# [1 0 1]
# [1 1 0]
# [1 1 1]]
If you want all elements in a single list, so you can index them with a single index, you can chain the iterables:
from itertools import chain, product
x = list(chain.from_iterable(product(range(2), repeat=3)))
result: [0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1]
Most people would expect a 2^n x n array, as in:
np.c_[tuple(i.ravel() for i in np.mgrid[:2,:2,:2])]
# array([[0, 0, 0],
# [0, 0, 1],
# [0, 1, 0],
# [0, 1, 1],
# [1, 0, 0],
# [1, 0, 1],
# [1, 1, 0],
# [1, 1, 1]])
Explanation: np.mgrid, as used here, creates the coordinates of the corners of a unit cube, which happen to be all combinations of 0 and 1. The individual coordinate arrays are then ravelled and joined as columns by np.c_.
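Another common idiom, sketched here, builds the same 2^n x n table arithmetically from the integers 0 .. 2^n - 1 with bit shifts; the names rows and bits are mine:

import numpy as np

n = 3
rows = np.arange(2 ** n)[:, None]   # integers 0..7 as a column vector
bits = np.arange(n - 1, -1, -1)     # bit positions, most significant first
table = (rows >> bits) & 1          # same 8 x 3 array of all combinations as above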
Here's a recursive, native Python (no libraries) version of it:
def allBinaryPossiblities(maxLength, s=""):
    if len(s) == maxLength:
        return s
    else:
        # branch on appending "0" then "1", joining the results with newlines
        temp = allBinaryPossiblities(maxLength, s + "0") + "\n"
        temp += allBinaryPossiblities(maxLength, s + "1")
        return temp

print(allBinaryPossiblities(3))
It prints all the possibilities:
000
001
010
011
100
101
110
111