Brute force to generate possible permutations - python

I have 4 point groups, each of which contains 5 different 3D positions. My goal is to brute-force every combination that takes one point from each group, without repeating any combination, and print each one as a (4x3) array. E.g. for input data:
1,2,3
4,5,6
7,8,9
10,11,12
13,14,15
16,17,18
19,20,21
22,23,24
25,26,27
28,29,30
31,32,33
34,35,36
37,38,39
40,41,42
43,44,45
46,47,48
49,50,51
52,53,54
55,56, 57
58,59,60
I read the file:
import numpy as np

def read_file(name):
    with open(name, 'r') as f:
        data = []
        for line in f:
            # parse one comma-separated line into a list of floats
            cols = [float(i) for i in line.strip().split(',')]
            data.append(cols)
    return np.array(data)
and reshape it to have 4x(5x3) arrays to be brute-forced:
def main():
    filePath = 'C:/Users/retw/input.txt'
    data = read_file(filePath)
    print('data:', data, type(data), data.shape)
    reshapedData = data.reshape(4, 5, 3)
    print('reshapedData :', reshapedData, type(reshapedData), reshapedData.shape)
The current output looks like:
reshapedData : [[[ 1. 2. 3.]
[ 4. 5. 6.]
[ 7. 8. 9.]
[10. 11. 12.]
[13. 14. 15.]]
[[16. 17. 18.]
[19. 20. 21.]
[22. 23. 24.]
[25. 26. 27.]
[28. 29. 30.]]
[[31. 32. 33.]
[34. 35. 36.]
[37. 38. 39.]
[40. 41. 42.]
[43. 44. 45.]]
[[46. 47. 48.]
[49. 50. 51.]
[52. 53. 54.]
[55. 56. 57.]
[58. 59. 60.]]] <class 'numpy.ndarray'> (4, 5, 3)
After brute force, the combinations, as an array or list, should look like:
[[1,2,3]
[16,17,18]
[31,32,33]
[46,47,48]]
[[1,2,3]
[19,20,21]
[31,32,33]
[46,47,48]]
[[1,2,3]
[22,23,24]
[31,32,33]
[46,47,48]]
etc,
until
[[13,14,15]
[28,29,30]
[43,44,45]
[58,59,60]]
Edit
For example, given two 2x3 arrays as input:
[[[1,2,3]
[4,5,6]]
[[7,8,9]
[10,11,12]]]
The output after brute force should be:
[[1,2,3]
[7,8,9]]
[[1,2,3]
[10,11,12]]
[[4,5,6]
[7,8,9]]
[[4,5,6]
[10,11,12]]

Here is a solution using numpy and a generator that appears to work, generates the correct number of combos (625), and sequences them as you are looking for...
import numpy as np

f_in = 'data.csv'
data = []
with open(f_in, 'r') as f:
    for line in f:
        cols = [float(i) for i in line.strip().split(',')]
        data.append(cols)
data = np.array(data).reshape((4, 5, 3))
#print(data)

def result_gen(data):
    odometer = [0, 0, 0, 0]
    roll_seq = [1, 2, 3, 0]  # the sequence of positions by which to roll the odometer
    expired = False
    while not expired:
        # pick one point per group; the nested index list gives shape (1, 4, 3)
        res = data[[0, 1, 2, 3], [odometer]]
        for i in roll_seq:
            if odometer[i] < 4:
                odometer[i] += 1
                break
            else:
                if i == 0:  # we have exhausted all combos
                    expired = True
                odometer[i] = 0
        yield res

my_gen = result_gen(data)
a = list(my_gen)
print(len(a))
for t in a[:6]:
    print(t)
Yields:
625
[[[ 1. 2. 3.]
[16. 17. 18.]
[31. 32. 33.]
[46. 47. 48.]]]
[[[ 1. 2. 3.]
[19. 20. 21.]
[31. 32. 33.]
[46. 47. 48.]]]
[[[ 1. 2. 3.]
[22. 23. 24.]
[31. 32. 33.]
[46. 47. 48.]]]
[[[ 1. 2. 3.]
[25. 26. 27.]
[31. 32. 33.]
[46. 47. 48.]]]
[[[ 1. 2. 3.]
[28. 29. 30.]
[31. 32. 33.]
[46. 47. 48.]]]
[[[ 1. 2. 3.]
[16. 17. 18.]
[34. 35. 36.]
[46. 47. 48.]]]
[Finished in 0.2s]
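As a possible alternative to the hand-rolled odometer, the same set of 625 combinations can be sketched with itertools.product from the standard library. This is only a sketch under assumptions: the data array is built with np.arange instead of being read from the file, and the ordering differs from the one requested in the question, because itertools.product varies the last group fastest rather than the second one.

```python
import itertools
import numpy as np

# Assumed stand-in for the file contents: values 1..60 as 4 groups of 5 points
data = np.arange(1, 61, dtype=float).reshape(4, 5, 3)

# Unpacking data passes the 4 groups as separate iterables; product then
# picks one point (a length-3 row) from each group per combination.
combos = np.array(list(itertools.product(*data)))

print(combos.shape)  # (625, 4, 3)
print(combos[0])
print(combos[-1])
```

Each element of combos is a plain (4, 3) array, without the extra leading axis that the odometer version produces.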

Looks like you want to create something like this.
import numpy as np
a = np.arange(1,61).reshape(4,5,3)
print (a)
b = np.zeros((20,4,3))
for k in range(4):
    for i in range(4):
        for j in range(5):
            if i == k:
                b[5*k + j][i] = a[i][j]
            else:
                b[5*k + j][i] = a[i][0]
print (b)
The output of this will be:
[[[ 1. 2. 3.]
[16. 17. 18.]
[31. 32. 33.]
[46. 47. 48.]]
[[ 4. 5. 6.]
[16. 17. 18.]
[31. 32. 33.]
[46. 47. 48.]]
[[ 7. 8. 9.]
[16. 17. 18.]
[31. 32. 33.]
[46. 47. 48.]]
[[10. 11. 12.]
[16. 17. 18.]
[31. 32. 33.]
[46. 47. 48.]]
[[13. 14. 15.]
[16. 17. 18.]
[31. 32. 33.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[31. 32. 33.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[19. 20. 21.]
[31. 32. 33.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[22. 23. 24.]
[31. 32. 33.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[25. 26. 27.]
[31. 32. 33.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[28. 29. 30.]
[31. 32. 33.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[31. 32. 33.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[34. 35. 36.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[37. 38. 39.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[40. 41. 42.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[43. 44. 45.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[31. 32. 33.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[31. 32. 33.]
[49. 50. 51.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[31. 32. 33.]
[52. 53. 54.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[31. 32. 33.]
[55. 56. 57.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[31. 32. 33.]
[58. 59. 60.]]]
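Looping this way gives a total of 20 arrays of shape 4 x 3: only one group varies at a time while the other three stay at their first point, so this is not the full set of 625 combinations.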

Manually calculate image gradient in tensorflow

I know there is image_gradients in tensorflow to get dx, dy of the image like this
dx, dy = tf.image.image_gradients(image)
print(image[0, :,:,0])
tf.Tensor(
[[ 0. 1. 2. 3. 4.]
[ 5. 6. 7. 8. 9.]
[10. 11. 12. 13. 14.]
[15. 16. 17. 18. 19.]
[20. 21. 22. 23. 24.]], shape=(5, 5), dtype=float32)
print(dx[0, :,:,0])
tf.Tensor(
[[5. 5. 5. 5. 5.]
[5. 5. 5. 5. 5.]
[5. 5. 5. 5. 5.]
[5. 5. 5. 5. 5.]
[0. 0. 0. 0. 0.]], shape=(5, 5), dtype=float32)
print(dy[0, :,:,0])
tf.Tensor(
[[1. 1. 1. 1. 0.]
[1. 1. 1. 1. 0.]
[1. 1. 1. 1. 0.]
[1. 1. 1. 1. 0.]
[1. 1. 1. 1. 0.]], shape=(5, 5), dtype=float32)
It looks like the gradient values are organized so that [I(x+1, y) - I(x, y)] is in location (x, y).
If I would like to do it manually, I'm not sure what I should do.
I tried to input the formula [I(x+1, y) - I(x, y)], but I have no idea how to implement it in the loop
x = image[0,:,:,0]
x_unpacked = tf.unstack(x)
processed = []
for t in x_unpacked:
    ???
    processed.append(result_tensor)
output = tf.concat(processed, 0)
Or if I can shift the whole tensor to the x,y direction, I could do the tensor subtraction, but still not sure about how to handle the edge information. (Above example, they are all zero for the last row/column)
Any help would be appreciated.
For the above example, taking img as the 2D slice image[0, :, :, 0], dx is:
dx = tf.pad(img[1:,] - img[:-1,], [[0,1],[0,0]])
and dy is:
dy = tf.pad(img[:,1:] - img[:,:-1], [[0,0],[0,1]])
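The shift, subtract and pad idea can be checked without TensorFlow. The following is a minimal NumPy sketch of the same computation on the 5x5 example, with np.pad standing in for tf.pad; the variable names follow the question, where dx is the row-wise difference. (tf.image.image_gradients is documented as returning the pair (dy, dx), which is probably why the question's dx holds the vertical differences.)

```python
import numpy as np

# The 5x5 example image from the question (values 0..24)
img = np.arange(25, dtype=np.float32).reshape(5, 5)

# Row i+1 minus row i, then one row of zeros appended at the bottom
dx = np.pad(img[1:, :] - img[:-1, :], [(0, 1), (0, 0)], mode="constant")

# Column j+1 minus column j, then one column of zeros appended on the right
dy = np.pad(img[:, 1:] - img[:, :-1], [(0, 0), (0, 1)], mode="constant")

print(dx)
print(dy)
```

The zero padding is what reproduces the all-zero last row and last column seen in the question's output.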

ValueError: Found array with dim 3. Estimator expected <= 2 python

I am trying to fit a decision tree to train and test data that are stored in lists named x and y.
My train data x is this:
[array([[19. , 14. , 0.8],
[23. , 24. , 0.8],
[25. , 26. , 0.8],
[22. , 24. , 1. ],
[25. , 29. , 1.4],
[36. , 86. , 1.6],
[28. , 52. , 0.8],
[21. , 20. , 1. ],
[22. , 28. , 0.8],
[24. , 27. , 1. ],
[18. , 8. , 0.6],
[30. , 58. , 1.2],
[24. , 30. , 0.8],
[24. , 28. , 0.8],
[32. , 65. , 1.6],
[28. , 47. , 0.8],
[26. , 41. , 0.8],
[18. , 14. , 0.6],
[32. , 71. , 2.2],
[27. , 45. , 2. ],
[29. , 53. , 2.2],
[18. , 11. , 0.8],
[20. , 23. , 0.8],
[20. , 19. , 0.6],
[20. , 15. , 0.6],
[19. , 18. , 0.4],
[24. , 55. , 1.2],
[24. , 59. , 1. ],
[20. , 17. , 0.6],
[21. , 28. , 0.8]])]
and y:
[array([ 3100., 2750., 7800., 6000., 15000., 15500., 5600., 8000.,
6000., 7500., 4000., 9000., 5850., 5750., 18000., 5600.,
5600., 4500., 22000., 21500., 24000., 4000., 6000., 4000.,
8000., 8000., 14000., 14000., 6000., 4000.])]
When I try to run
dtree= DecisionTreeRegressor(random_state=0, max_depth=1)
dtree.fit(x_train, y_train)
I get the error ValueError: Found array with dim 3. Estimator expected <= 2. I couldn't solve it with reshape, since these are lists. Any suggestions?
First of all, I recommend converting X and Y to numpy arrays; I can't be 100% sure whether your variables already are, since you haven't posted your code. Secondly, take a look at the shapes of your variables. As the documentation page for fit says:
X: {array-like, sparse matrix} of shape (n_samples, n_features)
and
y: array-like of shape (n_samples,) or (n_samples, n_outputs)
So fit expects a 2D array for X and a 1D or 2D array for y, and your X_train is 3D.
You need to reshape both. One solution can be:
ANSWER EDITED AFTER READING THE COMMENTS
You can't train on your data for two reasons:
X_train has a bad shape
Y_train has a bad shape
You are passing a 3D array as X_train, and fit only accepts 2D. Furthermore, your Y_train has shape (1, 30), which fit reads as 1 sample with 30 outputs rather than 30 samples. You need to flatten it and pass it as (30, ), as follows:
from sklearn.tree import DecisionTreeRegressor
import numpy as np
X_train = np.array([np.array([[19. , 14. , 0.8],
[23. , 24. , 0.8],
[25. , 26. , 0.8],
[22. , 24. , 1. ],
[25. , 29. , 1.4],
[36. , 86. , 1.6],
[28. , 52. , 0.8],
[21. , 20. , 1. ],
[22. , 28. , 0.8],
[24. , 27. , 1. ],
[18. , 8. , 0.6],
[30. , 58. , 1.2],
[24. , 30. , 0.8],
[24. , 28. , 0.8],
[32. , 65. , 1.6],
[28. , 47. , 0.8],
[26. , 41. , 0.8],
[18. , 14. , 0.6],
[32. , 71. , 2.2],
[27. , 45. , 2. ],
[29. , 53. , 2.2],
[18. , 11. , 0.8],
[20. , 23. , 0.8],
[20. , 19. , 0.6],
[20. , 15. , 0.6],
[19. , 18. , 0.4],
[24. , 55. , 1.2],
[24. , 59. , 1. ],
[20. , 17. , 0.6],
[21. , 28. , 0.8]])])
dimX1, dimX2, dimX3 = np.array(X_train).shape
X_train = np.reshape(np.array(X_train), (dimX1*dimX2, dimX3))
Y_train = np.array([np.array([ 3100., 2750., 7800., 6000., 15000., 15500., 5600., 8000.,
6000., 7500., 4000., 9000., 5850., 5750., 18000., 5600.,
5600., 4500., 22000., 21500., 24000., 4000., 6000., 4000.,
8000., 8000., 14000., 14000., 6000., 4000.])])
dimY1, dimY2 = Y_train.shape
Y_train = np.reshape(np.array(Y_train), (dimY2, ))
print(X_train.shape, Y_train.shape)
dtree= DecisionTreeRegressor(random_state=0, max_depth=1)
dtree.fit(X_train, Y_train)
Its output is:
>>> (30, 3) (30,)
>>> DecisionTreeRegressor(ccp_alpha=0.0, criterion='mse', max_depth=1,
max_features=None, max_leaf_nodes=None,
min_impurity_decrease=0.0, min_impurity_split=None,
min_samples_leaf=1, min_samples_split=2,
min_weight_fraction_leaf=0.0, presort='deprecated',
random_state=0, splitter='best')
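If x and y really are one-element lists of arrays, as the printed values in the question suggest, a shorter way to get the same shapes might be to stack the list contents instead of reshaping. A minimal sketch, using only the first three rows of the question's data as a stand-in (the variable names are illustrative):

```python
import numpy as np

# Stand-ins for the question's lists: each list wraps a single array
x = [np.array([[19., 14., 0.8],
               [23., 24., 0.8],
               [25., 26., 0.8]])]
y = [np.array([3100., 2750., 7800.])]

# vstack joins the 2D arrays in the list along the sample axis -> (n_samples, n_features)
X_train = np.vstack(x)
# concatenate joins the 1D arrays in the list end to end -> (n_samples,)
y_train = np.concatenate(y)

print(X_train.shape, y_train.shape)  # (3, 3) (3,)
```

Arrays with these shapes match what fit documents for X and y, and the approach also covers lists that hold more than one array, provided they share the same number of columns.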

Why doesn't assignment work on this DataFrame

I want to create a copy df with different values based on the previous one. I have used this technique before and it worked just fine, however it doesn't work here.
Does anyone know if I am missing something?
Code:
df2 = df1.copy()
for index, row in df2.iterrows():
    flowers_num = int(row["flowers_num"])
    if flowers_num >= 100:
        flowers_num = 10
    elif flowers_num >= 10:
        flowers_num = 8
    else:
        flowers_num = 6
    row["flowers_num"] = flowers_num
Unique values on df2 before loop:
[ 0. 1. 10. 15. 6. 2. 4. 3. 44. 8. 9. 7. 22. 5.
11. 19. 12. 13. 21. 20. 14. 23. 16. 18. 24. 17. 35. 32.
25. 30. 28. 57. 45. 27. 42. 38. 43. 37. 34. 26. 29. 41.
52. 31. 39. 46. 51. 131. 36. 61. 53. 33. 48. 40. 58. 49.
76. 50. 119. 55. 91. 59. 106. 56. 65. 54. 47. 63. 64. 67.
75. 102. 74. 70. 60.]
Unique values on df2 after loop (should be just 6, 8 or 10):
[ 0. 1. 10. 15. 6. 2. 4. 3. 44. 8. 9. 7. 22. 5.
11. 19. 12. 13. 21. 20. 14. 23. 16. 18. 24. 17. 35. 32.
25. 30. 28. 57. 45. 27. 42. 38. 43. 37. 34. 26. 29. 41.
52. 31. 39. 46. 51. 131. 36. 61. 53. 33. 48. 40. 58. 49.
76. 50. 119. 55. 91. 59. 106. 56. 65. 54. 47. 63. 64. 67.
75. 102. 74. 70. 60.]
Thanks in advance!
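Your code worked for me; however, the "pandas" way to do this is to use pd.cut. Passing right=False makes each bin include its left edge, which matches the >= comparisons in your loop, and starting the bins at -np.inf keeps the 0 values from falling outside the first bin:
pd.cut(df1['flowers_num'], [-np.inf, 10, 100, np.inf], right=False, labels=[6, 8, 10])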
It would be much faster to use apply on the column rather than iterrows.
Create a function to change the values:
def change_num(x):
    if x >= 100:
        return 10
    elif x >= 10:
        return 8
    else:
        return 6
Dummy DataFrame:
import numpy as np
import pandas as pd
df_ex = pd.DataFrame({'flowers_num': np.random.randint(1,1000,20)})
Using apply:
df_ex["flowers_num"] = df_ex["flowers_num"].apply(change_num)
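As for why the loop in the question may leave df2 unchanged: iterrows yields each row as a separate Series, and the pandas documentation warns that this can be a copy rather than a view, so writing to row is not guaranteed to reach the DataFrame. A minimal sketch of a vectorized replacement follows, using np.select with the same thresholds; the small df1 here is made-up sample data, not the asker's.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data covering the threshold edges
df1 = pd.DataFrame({"flowers_num": [0.0, 9.0, 10.0, 99.0, 100.0, 131.0]})
df2 = df1.copy()

# Conditions are checked in order, like the if / elif chain in the question;
# rows matching neither condition get the default value.
conditions = [df2["flowers_num"] >= 100, df2["flowers_num"] >= 10]
df2["flowers_num"] = np.select(conditions, [10, 8], default=6)

print(df2["flowers_num"].tolist())  # [6, 6, 8, 8, 10, 10]
```

Assigning the whole column at once writes to df2 directly, so the result does not depend on whether a row object is a view or a copy.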

NumPy: impute mean of the two nearest rows for all NaN

I have a NumPy array with missing values. I want to impute the mean of the nearest values vertically.
import numpy as np
arr = np.random.randint(0, 10, (10, 4)).astype(float)
arr[2, 0] = np.nan
arr[4, 3] = np.nan
arr[0, 2] = np.nan
print(arr)
[[ 5. 7. nan 4.] # should be 4
[ 2. 6. 4. 9.]
[nan 2. 5. 5.] # should be 4.5
[ 7. 0. 3. 8.]
[ 6. 4. 3. nan] # should be 4
[ 8. 1. 2. 0.]
[ 0. 0. 1. 1.]
[ 1. 2. 6. 6.]
[ 8. 1. 9. 7.]
[ 3. 5. 8. 8.]]
If you are open to using Pandas, pd.DataFrame.interpolate is easy to use. Set limit_direction if "interpolating" values at ends of array:
import pandas as pd
df = pd.DataFrame(arr).interpolate(limit_direction='both')
df.to_numpy() # back to a numpy array if needed (if using v0.24.0 or above)
Output:
array([[5. , 7. , 4. , 4. ],
[2. , 6. , 4. , 9. ],
[4.5, 2. , 5. , 5. ],
[7. , 0. , 3. , 8. ],
[6. , 4. , 3. , 4. ],
[8. , 1. , 2. , 0. ],
[0. , 0. , 1. , 1. ],
[1. , 2. , 6. , 6. ],
[8. , 1. , 9. , 7. ],
[3. , 5. , 8. , 8. ]])
import numpy as np
arr = np.random.randint(0, 10, (10, 4)).astype(float)
arr[2, 0] = np.nan
arr[4, 3] = np.nan
arr[0, 2] = np.nan
print(arr)
[[ 5. 7. nan 4.]
[ 2. 6. 4. 9.]
[nan 2. 5. 5.]
[ 7. 0. 3. 8.]
[ 6. 4. 3. nan]
[ 8. 1. 2. 0.]
[ 0. 0. 1. 1.]
[ 1. 2. 6. 6.]
[ 8. 1. 9. 7.]
[ 3. 5. 8. 8.]]
for x, y in np.argwhere(np.isnan(arr)):
    # take the rows directly above and below, clipped to the array bounds
    sample = arr[np.maximum(x - 1, 0):np.minimum(x + 2, arr.shape[0]), y]
    arr[x, y] = np.mean(sample[np.logical_not(np.isnan(sample))])
print(arr)
[[5. 7. 4. 4. ] # 3rd value here is mean(4)
[2. 6. 4. 9. ]
[4.5 2. 5. 5. ] # first value here is mean(2, 7)
[7. 0. 3. 8. ]
[6. 4. 3. 4. ] # 4th value here is mean(8, 0)
[8. 1. 2. 0. ]
[0. 0. 1. 1. ]
[1. 2. 6. 6. ]
[8. 1. 9. 7. ]
[3. 5. 8. 8. ]]
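The loop above can also be wrapped in a function that leaves its input untouched. This is a sketch under two assumptions that are not stated in the question: neighbours are read from the original array rather than from already-imputed values, and a NaN whose vertical neighbours are all missing is left as NaN. The function name is made up for illustration.

```python
import numpy as np

def impute_from_vertical_neighbors(arr):
    """Return a copy where each NaN is the mean of the valid values directly above and below."""
    out = arr.copy()
    n_rows = arr.shape[0]
    for r, c in np.argwhere(np.isnan(arr)):
        # Rows r-1 .. r+1 of column c, clipped at the top and bottom edges
        window = arr[max(r - 1, 0):min(r + 2, n_rows), c]
        valid = window[~np.isnan(window)]
        if valid.size:  # otherwise keep the NaN
            out[r, c] = valid.mean()
    return out

# First six rows of the example array from the question
arr = np.array([[5., 7., np.nan, 4.],
                [2., 6., 4., 9.],
                [np.nan, 2., 5., 5.],
                [7., 0., 3., 8.],
                [6., 4., 3., np.nan],
                [8., 1., 2., 0.]])
filled = impute_from_vertical_neighbors(arr)
print(filled)
```

On this example the three gaps are filled with 4, 4.5 and 4, matching the values the question asks for.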

Extract non-main diagonal from scipy sparse matrix?

Say that I have a sparse matrix in scipy.sparse format. How can I extract a diagonal other than the main diagonal? For a numpy array, you can use numpy.diag. Is there a scipy sparse equivalent?
For example:
import numpy as np
from scipy import sparse
A = sparse.diags(np.ones(5), 1)
How would I get back the vector of ones without converting to a numpy array?
When the sparse array is in dia format, the data along the diagonals is recorded in the offsets and data attributes:
import scipy.sparse as sparse
import numpy as np

def make_sparse_array():
    A = np.arange(ncol*nrow).reshape(nrow, ncol)
    row, col = zip(*np.ndindex(nrow, ncol))
    val = A.ravel()
    A = sparse.coo_matrix(
        (val, (row, col)), shape=(nrow, ncol), dtype='float')
    A = A.todia()
    # A = sparse.diags(np.ones(5), 1)
    # A = sparse.diags([np.ones(4),np.ones(3)*2,], [2,3])
    print(A.toarray())
    return A

nrow, ncol = 10, 5
A = make_sparse_array()
# each row of A.data holds one diagonal plus padding; the slices strip the padding
diags = {offset:(diag[offset:nrow+offset] if 0<=offset<=ncol else
                 diag if offset+nrow-ncol>=0 else
                 diag[:offset+nrow-ncol])
         for offset, diag in zip(A.offsets, A.data)}
for offset, diag in sorted(diags.items()):  # .items() replaces Python 2's .iteritems()
    print('{o}: {d}'.format(o=offset, d=diag))
Thus for the array
[[ 0. 1. 2. 3. 4.]
[ 5. 6. 7. 8. 9.]
[ 10. 11. 12. 13. 14.]
[ 15. 16. 17. 18. 19.]
[ 20. 21. 22. 23. 24.]
[ 25. 26. 27. 28. 29.]
[ 30. 31. 32. 33. 34.]
[ 35. 36. 37. 38. 39.]
[ 40. 41. 42. 43. 44.]
[ 45. 46. 47. 48. 49.]]
the code above yields
-9: [ 45.]
-8: [ 40. 46.]
-7: [ 35. 41. 47.]
-6: [ 30. 36. 42. 48.]
-5: [ 25. 31. 37. 43. 49.]
-4: [ 20. 26. 32. 38. 44.]
-3: [ 15. 21. 27. 33. 39.]
-2: [ 10. 16. 22. 28. 34.]
-1: [ 5. 11. 17. 23. 29.]
0: [ 0. 6. 12. 18. 24.]
1: [ 1. 7. 13. 19.]
2: [ 2. 8. 14.]
3: [ 3. 9.]
4: [ 4.]
The output above is printing the offset followed by the diagonal at that offset.
The code above should work for any sparse array. I used a fully populated sparse array only to make it easier to check that the output is correct.
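On SciPy 1.0 or later there may be a shorter route: the sparse matrix diagonal method accepts an offset k, so the k-th diagonal can be read without touching the offsets and data attributes. A minimal sketch on the matrix from the question, assuming such a SciPy version is installed:

```python
import numpy as np
from scipy import sparse

# The example from the question: ones on the first superdiagonal of a 6x6 matrix
A = sparse.diags(np.ones(5), 1)

# k=1 selects the diagonal one step above the main diagonal
upper = A.diagonal(k=1)
print(upper)

# The main diagonal of this matrix holds only zeros
main = A.diagonal()
print(main)
```

The result is a dense 1D NumPy array for just that diagonal; the matrix itself is never converted to a dense array.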
