Brute force to generate possible permutations - python
I have 4 point groups, each containing 5 different 3D positions. My goal is to brute-force all possible four-point permutations (one position taken from each group) without repeating an ordering, and print each one as a (4x3) array. E.g. for input data:
1,2,3
4,5,6
7,8,9
10,11,12
13,14,15
16,17,18
19,20,21
22,23,24
25,26,27
28,29,30
31,32,33
34,35,36
37,38,39
40,41,42
43,44,45
46,47,48
49,50,51
52,53,54
55,56,57
58,59,60
I read the file:
import numpy as np

def read_file(name):
    # Parse each comma-separated line into a row of floats.
    with open(name, 'r') as f:
        data = []
        for line in f:
            l = line.strip()
            cols = [float(i) for i in l.split(',')]
            data.append(cols)
    return np.array(data)
and reshape it to have 4x(5x3) arrays to be brute-forced:
def main():
    filePath = 'C:/Users/retw/input.txt'
    data = read_file(filePath)
    print('data:', data, type(data), data.shape)
    reshapedData = data.reshape(4, 5, 3)
    print('reshapedData :', reshapedData, type(reshapedData), reshapedData.shape)
The current output looks like:
reshapedData : [[[ 1. 2. 3.]
[ 4. 5. 6.]
[ 7. 8. 9.]
[10. 11. 12.]
[13. 14. 15.]]
[[16. 17. 18.]
[19. 20. 21.]
[22. 23. 24.]
[25. 26. 27.]
[28. 29. 30.]]
[[31. 32. 33.]
[34. 35. 36.]
[37. 38. 39.]
[40. 41. 42.]
[43. 44. 45.]]
[[46. 47. 48.]
[49. 50. 51.]
[52. 53. 54.]
[55. 56. 57.]
[58. 59. 60.]]] <class 'numpy.ndarray'> (4, 5, 3)
After the brute force, the permutations, as an array or list, should look like:
[[1,2,3]
[16,17,18]
[31,32,33]
[46,47,48]]
[[1,2,3]
[19,20,21]
[31,32,33]
[46,47,48]]
[[1,2,3]
[22,23,24]
[31,32,33]
[46,47,48]]
etc., until
[[13,14,15]
[28,29,30]
[43,44,45]
[58,59,60]]
Edit
For example, given two 2x3 arrays as input:
[[[1,2,3]
[4,5,6]]
[[7,8,9]
[10,11,12]]]
The output after brute force should be:
[[1,2,3]
[7,8,9]]
[[1,2,3]
[10,11,12]]
[[4,5,6]
[7,8,9]]
[[4,5,6]
[10,11,12]]
Here is a solution using numpy and a generator that appears to work: it generates the correct number of combos (5^4 = 625) and sequences them in the order you are looking for.
import numpy as np

f_in = 'data.csv'
data = []
with open(f_in, 'r') as f:
    for line in f:
        l = line.strip()
        cols = [float(i) for i in line.split(',')]
        data.append(cols)
data = np.array(data).reshape((4,5,3))
#print(data)

def result_gen(data):
    odometer = [0, 0, 0, 0]
    roll_seq = [1, 2, 3, 0]  # the sequence of positions by which to roll the odometer
    expired = False
    while not expired:
        res = data[[0, 1, 2, 3], [odometer]]
        for i in roll_seq:
            if odometer[i] < 4:
                odometer[i] += 1
                break
            else:
                if i == 0:  # we have exhausted all combos
                    expired = True
                odometer[i] = 0
        yield res

my_gen = result_gen(data)
a = list(my_gen)
print(len(a))
for t in a[:6]:
    print(t)
Yields:
625
[[[ 1. 2. 3.]
[16. 17. 18.]
[31. 32. 33.]
[46. 47. 48.]]]
[[[ 1. 2. 3.]
[19. 20. 21.]
[31. 32. 33.]
[46. 47. 48.]]]
[[[ 1. 2. 3.]
[22. 23. 24.]
[31. 32. 33.]
[46. 47. 48.]]]
[[[ 1. 2. 3.]
[25. 26. 27.]
[31. 32. 33.]
[46. 47. 48.]]]
[[[ 1. 2. 3.]
[28. 29. 30.]
[31. 32. 33.]
[46. 47. 48.]]]
[[[ 1. 2. 3.]
[16. 17. 18.]
[34. 35. 36.]
[46. 47. 48.]]]
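As a possible alternative to the hand-rolled odometer, the same index sequence can be sketched with itertools.product. This is only a sketch: the input here is generated with np.arange instead of being read from the file, and the names idx and combos are made up for illustration. The unpacking order (i0, i3, i2, i1) is chosen so that group 2 varies fastest, then groups 3 and 4, and group 1 last, which is the order the question asks for.

```python
from itertools import product

import numpy as np

# Stand-in for the file contents: 4 groups of 5 points with 3 coordinates each.
data = np.arange(1, 61, dtype=float).reshape(4, 5, 3)

# product() varies its rightmost element fastest, so unpack as (i0, i3, i2, i1)
# to make group 2 roll first, then group 3, then group 4, then group 1.
idx = np.array([(i0, i1, i2, i3)
                for i0, i3, i2, i1 in product(range(5), repeat=4)])

# combos[n, g] is the point picked from group g in the n-th combination.
combos = data[np.arange(4), idx]

print(combos.shape)
print(combos[1])
```

Each combos[n] is a plain (4, 3) array, without the extra leading axis that data[[0, 1, 2, 3], [odometer]] produces.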
Looks like you want to create something like this.
import numpy as np

a = np.arange(1,61).reshape(4,5,3)
print(a)
b = np.zeros((20,4,3))
for k in range(4):
    for i in range(4):
        for j in range(5):
            if i == k:
                b[5*k + j][i] = a[i][j]
            else:
                b[5*k + j][i] = a[i][0]
print(b)
The output of this will be:
[[[ 1. 2. 3.]
[16. 17. 18.]
[31. 32. 33.]
[46. 47. 48.]]
[[ 4. 5. 6.]
[16. 17. 18.]
[31. 32. 33.]
[46. 47. 48.]]
[[ 7. 8. 9.]
[16. 17. 18.]
[31. 32. 33.]
[46. 47. 48.]]
[[10. 11. 12.]
[16. 17. 18.]
[31. 32. 33.]
[46. 47. 48.]]
[[13. 14. 15.]
[16. 17. 18.]
[31. 32. 33.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[31. 32. 33.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[19. 20. 21.]
[31. 32. 33.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[22. 23. 24.]
[31. 32. 33.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[25. 26. 27.]
[31. 32. 33.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[28. 29. 30.]
[31. 32. 33.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[31. 32. 33.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[34. 35. 36.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[37. 38. 39.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[40. 41. 42.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[43. 44. 45.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[31. 32. 33.]
[46. 47. 48.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[31. 32. 33.]
[49. 50. 51.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[31. 32. 33.]
[52. 53. 54.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[31. 32. 33.]
[55. 56. 57.]]
[[ 1. 2. 3.]
[16. 17. 18.]
[31. 32. 33.]
[58. 59. 60.]]]
Looping this way, I get a total of 20 arrays of shape 4 x 3.
Related
Manually calculate image gradient in tensorflow
I know there is image_gradients in tensorflow to get dx, dy of the image like this:
dx, dy = tf.image.image_gradients(image)
print(image[0, :,:,0])
tf.Tensor(
[[ 0. 1. 2. 3. 4.]
[ 5. 6. 7. 8. 9.]
[10. 11. 12. 13. 14.]
[15. 16. 17. 18. 19.]
[20. 21. 22. 23. 24.]], shape=(5, 5), dtype=float32)
print(dx[0, :,:,0])
tf.Tensor(
[[5. 5. 5. 5. 5.]
[5. 5. 5. 5. 5.]
[5. 5. 5. 5. 5.]
[5. 5. 5. 5. 5.]
[0. 0. 0. 0. 0.]], shape=(5, 5), dtype=float32)
print(dy[0, :,:,0])
tf.Tensor(
[[1. 1. 1. 1. 0.]
[1. 1. 1. 1. 0.]
[1. 1. 1. 1. 0.]
[1. 1. 1. 1. 0.]
[1. 1. 1. 1. 0.]], shape=(5, 5), dtype=float32)
It looks like the gradient values are organized so that [I(x+1, y) - I(x, y)] is in location (x, y). If I would like to do it manually, I'm not sure what I should do. I tried to apply the formula [I(x+1, y) - I(x, y)], but I have no idea how to implement it in the loop:
x = image[0,:,:,0]
x_unpacked = tf.unstack(x)
processed = []
for t in x_unpacked:
    ???
    processed.append(result_tensor)
output = tf.concat(processed, 0)
Or, if I could shift the whole tensor in the x and y directions, I could do a tensor subtraction, but I'm still not sure how to handle the edge information. (In the above example, the values are all zero for the last row/column.) Any help would be appreciated.
For the above example, for dx:
dx = tf.pad(img[1:,] - img[:-1,], [[0,1],[0,0]])
and for dy:
dy = tf.pad(img[:,1:] - img[:,:-1], [[0,0],[0,1]])
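The same shift, subtract and pad idea can be checked without TensorFlow. The sketch below uses NumPy only, with np.pad standing in for tf.pad; the function name forward_differences is made up for illustration. Note that the TensorFlow documentation describes the return value of tf.image.image_gradients as the pair (dy, dx), which would explain why the tensor called dx in the question holds the row-to-row differences.

```python
import numpy as np

def forward_differences(img):
    """Return (row_diff, col_diff), each shaped like img.

    row_diff[r, c] = img[r + 1, c] - img[r, c], with a zero last row.
    col_diff[r, c] = img[r, c + 1] - img[r, c], with a zero last column.
    """
    row_diff = np.pad(img[1:, :] - img[:-1, :], [(0, 1), (0, 0)], mode="constant")
    col_diff = np.pad(img[:, 1:] - img[:, :-1], [(0, 0), (0, 1)], mode="constant")
    return row_diff, col_diff

# Same 5x5 test image as in the question.
img = np.arange(25, dtype=np.float32).reshape(5, 5)
row_diff, col_diff = forward_differences(img)
print(row_diff)
print(col_diff)
```

Padding after the subtraction is what handles the edge: the last row (or column) has no successor, so it is filled with zeros.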
ValueError: Found array with dim 3. Estimator expected <= 2 python
I am trying to fit a decision tree on some train and test data that are stored in lists named x and y. My train data x is this:
[array([[19. , 14. , 0.8], [23. , 24. , 0.8], [25. , 26. , 0.8],
[22. , 24. , 1. ], [25. , 29. , 1.4], [36. , 86. , 1.6],
[28. , 52. , 0.8], [21. , 20. , 1. ], [22. , 28. , 0.8],
[24. , 27. , 1. ], [18. , 8. , 0.6], [30. , 58. , 1.2],
[24. , 30. , 0.8], [24. , 28. , 0.8], [32. , 65. , 1.6],
[28. , 47. , 0.8], [26. , 41. , 0.8], [18. , 14. , 0.6],
[32. , 71. , 2.2], [27. , 45. , 2. ], [29. , 53. , 2.2],
[18. , 11. , 0.8], [20. , 23. , 0.8], [20. , 19. , 0.6],
[20. , 15. , 0.6], [19. , 18. , 0.4], [24. , 55. , 1.2],
[24. , 59. , 1. ], [20. , 17. , 0.6], [21. , 28. , 0.8]])]
and y:
[array([ 3100., 2750., 7800., 6000., 15000., 15500., 5600., 8000., 6000., 7500.,
4000., 9000., 5850., 5750., 18000., 5600., 5600., 4500., 22000., 21500.,
24000., 4000., 6000., 4000., 8000., 8000., 14000., 14000., 6000., 4000.])]
When I try to run
dtree = DecisionTreeRegressor(random_state=0, max_depth=1)
dtree.fit(x_train, y_train)
I get the error
ValueError: Found array with dim 3. Estimator expected <= 2.
and I couldn't solve it with reshape, since these are lists. Any suggestions?
First of all, I recommend converting X and Y to numpy arrays; I cannot be 100% sure what type your variables are, since you haven't posted your code here. Secondly, take a look at your variables. As the documentation page says:
X {array-like, sparse matrix} of shape (n_samples, n_features)
y array-like of shape (n_samples,) or (n_samples, n_outputs)
So fit expects X to be 2D and y to be 1D or 2D, and your X_train is 3D. You need to reshape both.
Answer edited after reading the comments:
You can't train on your data for two reasons:
X_train has a bad shape
Y_train has a bad shape
You are passing a 3D array as X_train, and fit only accepts 2D. Furthermore, your Y_train has shape (1, 30), which is read as a single sample with 30 outputs rather than 30 samples. You need to flatten it and pass it with shape (30,), as follows:
from sklearn.tree import DecisionTreeRegressor
import numpy as np

X_train = np.array([np.array([[19. , 14. , 0.8], [23. , 24. , 0.8], [25. , 26. , 0.8],
    [22. , 24. , 1. ], [25. , 29. , 1.4], [36. , 86. , 1.6],
    [28. , 52. , 0.8], [21. , 20. , 1. ], [22. , 28. , 0.8],
    [24. , 27. , 1. ], [18. , 8. , 0.6], [30. , 58. , 1.2],
    [24. , 30. , 0.8], [24. , 28. , 0.8], [32. , 65. , 1.6],
    [28. , 47. , 0.8], [26. , 41. , 0.8], [18. , 14. , 0.6],
    [32. , 71. , 2.2], [27. , 45. , 2. ], [29. , 53. , 2.2],
    [18. , 11. , 0.8], [20. , 23. , 0.8], [20. , 19. , 0.6],
    [20. , 15. , 0.6], [19. , 18. , 0.4], [24. , 55. , 1.2],
    [24. , 59. , 1. ], [20. , 17. , 0.6], [21. , 28. , 0.8]])])
# Collapse the leading axis: (1, 30, 3) -> (30, 3)
dimX1, dimX2, dimX3 = np.array(X_train).shape
X_train = np.reshape(np.array(X_train), (dimX1*dimX2, dimX3))

Y_train = np.array([np.array([ 3100., 2750., 7800., 6000., 15000., 15500., 5600., 8000., 6000., 7500.,
    4000., 9000., 5850., 5750., 18000., 5600., 5600., 4500., 22000., 21500.,
    24000., 4000., 6000., 4000., 8000., 8000., 14000., 14000., 6000., 4000.])])
# Flatten the targets: (1, 30) -> (30,)
dimY1, dimY2 = Y_train.shape
Y_train = np.reshape(np.array(Y_train), (dimY2, ))

print(X_train.shape, Y_train.shape)
dtree = DecisionTreeRegressor(random_state=0, max_depth=1)
dtree.fit(X_train, Y_train)
Its output is:
>>> (30, 3) (30,)
>>> DecisionTreeRegressor(ccp_alpha=0.0, criterion='mse', max_depth=1, max_features=None, max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, min_samples_leaf=1, min_samples_split=2, min_weight_fraction_leaf=0.0, presort='deprecated', random_state=0, splitter='best')
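When the data arrive as Python lists that each wrap one array, as in the question, stacking the list elements is another way to reach the shapes fit expects. A minimal sketch, assuming that layout; x_list and y_list are shortened, hypothetical stand-ins for the question's x and y, and the fit call itself is left out so that the snippet depends on NumPy only.

```python
import numpy as np

# Hypothetical, shortened stand-ins for the question's lists:
# each list wraps a single array.
x_list = [np.array([[19.0, 14.0, 0.8],
                    [23.0, 24.0, 0.8],
                    [25.0, 26.0, 0.8]])]
y_list = [np.array([3100.0, 2750.0, 7800.0])]

# Converting the list directly adds a leading axis, which is what triggers
# "Found array with dim 3".
print(np.asarray(x_list).shape)

# Stacking the list elements row-wise gives (n_samples, n_features) ...
X = np.vstack(x_list)
# ... and concatenating the target arrays gives (n_samples,).
y = np.concatenate(y_list)
print(X.shape, y.shape)
```

This also keeps working if the lists later hold several arrays with the same number of columns, since vstack and concatenate join them end to end.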
Why doesn't assignment work on this DataFrame
I want to create a copy of a DataFrame with different values based on the previous one. I have used this technique before and it worked just fine; however, it doesn't work here. Does anyone know if I am missing something?
Code:
df2 = df1.copy()
for index, row in df2.iterrows():
    flowers_num = int(row["flowers_num"])
    if flowers_num >= 100:
        flowers_num = 10
    elif flowers_num >= 10:
        flowers_num = 8
    else:
        flowers_num = 6
    row["flowers_num"] = flowers_num
Unique values in df2 before the loop:
[ 0. 1. 10. 15. 6. 2. 4. 3. 44. 8. 9. 7. 22. 5. 11. 19. 12. 13. 21. 20. 14. 23. 16. 18. 24. 17. 35. 32. 25. 30. 28. 57. 45. 27. 42. 38. 43. 37. 34. 26. 29. 41. 52. 31. 39. 46. 51. 131. 36. 61. 53. 33. 48. 40. 58. 49. 76. 50. 119. 55. 91. 59. 106. 56. 65. 54. 47. 63. 64. 67. 75. 102. 74. 70. 60.]
Unique values in df2 after the loop (should be just 6, 8 or 10):
[ 0. 1. 10. 15. 6. 2. 4. 3. 44. 8. 9. 7. 22. 5. 11. 19. 12. 13. 21. 20. 14. 23. 16. 18. 24. 17. 35. 32. 25. 30. 28. 57. 45. 27. 42. 38. 43. 37. 34. 26. 29. 41. 52. 31. 39. 46. 51. 131. 36. 61. 53. 33. 48. 40. 58. 49. 76. 50. 119. 55. 91. 59. 106. 56. 65. 54. 47. 63. 64. 67. 75. 102. 74. 70. 60.]
Thanks in advance!
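Your code worked for me; however, the "pandas" way to do this is to use pd.cut. Pass right=False and start the bins at -np.inf so that the intervals are [-inf, 10), [10, 100) and [100, inf), matching the >= comparisons in your loop (with the default right=True, the values 0, 10 and 100 would land in the wrong bin or in none):
pd.cut(df1['flowers_num'], [-np.inf, 10, 100, np.inf], labels=[6, 8, 10], right=False)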
It would be much faster to use apply on the column rather than iterrows. Create a function to change the values:
def change_num(x):
    if x >= 100:
        return 10
    elif x >= 10:
        return 8
    else:
        return 6
Dummy DataFrame:
df_ex = pd.DataFrame({'flowers_num': np.random.randint(1,1000,20)})
Using apply:
df_ex["flowers_num"] = df_ex["flowers_num"].apply(change_num)
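On why the loop in the question can silently do nothing: the pandas documentation for iterrows warns that, depending on the data types, the row it yields may be a copy rather than a view, so writing to it need not change the DataFrame. Assigning to the whole column avoids the issue. The sketch below uses np.select as one vectorized option; the small df1 is a hypothetical stand-in for the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's df1, including the boundary values.
df1 = pd.DataFrame({"flowers_num": [0.0, 9.0, 10.0, 99.0, 100.0, 131.0]})

df2 = df1.copy()
n = df2["flowers_num"]

# Conditions are checked in order, mirroring the if / elif / else in the question.
df2["flowers_num"] = np.select([n >= 100, n >= 10], [10, 8], default=6)

print(df2["flowers_num"].unique())
```

Because the assignment replaces the column on df2 itself, df1 keeps its original values.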
NumPy: impute mean of the two nearest rows for all NaN
I have a NumPy array with missing values. I want to impute the mean of the nearest values vertically.
import numpy as np
arr = np.random.randint(0, 10, (10, 4)).astype(float)
arr[2, 0] = np.nan
arr[4, 3] = np.nan
arr[0, 2] = np.nan
print(arr)
[[ 5. 7. nan 4.] # should be 4
[ 2. 6. 4. 9.]
[nan 2. 5. 5.] # should be 4.5
[ 7. 0. 3. 8.]
[ 6. 4. 3. nan] # should be 4
[ 8. 1. 2. 0.]
[ 0. 0. 1. 1.]
[ 1. 2. 6. 6.]
[ 8. 1. 9. 7.]
[ 3. 5. 8. 8.]]
If you are open to using Pandas, pd.DataFrame.interpolate is easy to use. Set limit_direction if "interpolating" values at the ends of the array:
df = pd.DataFrame(arr).interpolate(limit_direction='both')
df.to_numpy() # back to a numpy array if needed (if using v0.24.0 or above)
Output:
array([[5. , 7. , 4. , 4. ],
[2. , 6. , 4. , 9. ],
[4.5, 2. , 5. , 5. ],
[7. , 0. , 3. , 8. ],
[6. , 4. , 3. , 4. ],
[8. , 1. , 2. , 0. ],
[0. , 0. , 1. , 1. ],
[1. , 2. , 6. , 6. ],
[8. , 1. , 9. , 7. ],
[3. , 5. , 8. , 8. ]])
import numpy as np
arr = np.random.randint(0, 10, (10, 4)).astype(float)
arr[2, 0] = np.nan
arr[4, 3] = np.nan
arr[0, 2] = np.nan
print(arr)
[[ 5. 7. nan 4.]
[ 2. 6. 4. 9.]
[nan 2. 5. 5.]
[ 7. 0. 3. 8.]
[ 6. 4. 3. nan]
[ 8. 1. 2. 0.]
[ 0. 0. 1. 1.]
[ 1. 2. 6. 6.]
[ 8. 1. 9. 7.]
[ 3. 5. 8. 8.]]
for x, y in np.argwhere(np.isnan(arr)):
    # rows x-1 .. x+1 of column y, clipped to the array bounds
    sample = arr[np.maximum(x - 1, 0):np.minimum(x + 2, arr.shape[0]), y]
    arr[x, y] = np.mean(sample[np.logical_not(np.isnan(sample))])
print(arr)
[[5. 7. 4. 4. ] # 3rd value here is mean(4)
[2. 6. 4. 9. ]
[4.5 2. 5. 5. ] # first value here is mean(2, 7)
[7. 0. 3. 8. ]
[6. 4. 3. 4. ] # 4th value here is mean(8, 0)
[8. 1. 2. 0. ]
[0. 0. 1. 1. ]
[1. 2. 6. 6. ]
[8. 1. 9. 7. ]
[3. 5. 8. 8. ]]
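For reuse, the neighbor-mean idea can be wrapped in a function and run on the fixed array from the question, so the result is reproducible. This is only a sketch: the function name is made up, and reading neighbors from an untouched copy (so that one imputed value never feeds into another) is a design assumption, not something the question specifies.

```python
import numpy as np

def impute_vertical_neighbor_mean(arr):
    """Fill each NaN with the mean of the non-NaN values directly above and below it."""
    src = np.asarray(arr, dtype=float)
    out = src.copy()
    n_rows = src.shape[0]
    for r, c in np.argwhere(np.isnan(src)):
        # Rows r-1 .. r+1 of column c, clipped to the array bounds.
        neighbors = src[max(r - 1, 0):min(r + 2, n_rows), c]
        valid = neighbors[~np.isnan(neighbors)]
        if valid.size:  # leave the NaN in place if no neighbor is available
            out[r, c] = valid.mean()
    return out

nan = np.nan
arr = np.array([[5, 7, nan, 4],
                [2, 6, 4, 9],
                [nan, 2, 5, 5],
                [7, 0, 3, 8],
                [6, 4, 3, nan],
                [8, 1, 2, 0],
                [0, 0, 1, 1],
                [1, 2, 6, 6],
                [8, 1, 9, 7],
                [3, 5, 8, 8]], dtype=float)

filled = impute_vertical_neighbor_mean(arr)
print(filled[0, 2], filled[2, 0], filled[4, 3])
```

For isolated gaps like these the result agrees with the values requested in the question; runs of consecutive NaNs in a column would need a policy of their own.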
Extract non-main diagonal from scipy sparse matrix?
Say that I have a sparse matrix in scipy.sparse format. How can I extract a diagonal other than the main diagonal? For a numpy array, you can use numpy.diag. Is there a scipy sparse equivalent? For example:
import numpy as np
from scipy import sparse
A = sparse.diags(np.ones(5), 1)
How would I get back the vector of ones without converting to a numpy array?
When the sparse array is in dia format, the data along the diagonals is recorded in the offsets and data attributes:
import scipy.sparse as sparse
import numpy as np

def make_sparse_array():
    A = np.arange(ncol*nrow).reshape(nrow, ncol)
    row, col = zip(*np.ndindex(nrow, ncol))
    val = A.ravel()
    A = sparse.coo_matrix(
        (val, (row, col)), shape=(nrow, ncol), dtype='float')
    A = A.todia()
    # A = sparse.diags(np.ones(5), 1)
    # A = sparse.diags([np.ones(4),np.ones(3)*2,], [2,3])
    print(A.toarray())
    return A

nrow, ncol = 10, 5
A = make_sparse_array()
# Trim each stored row of A.data to the entries that actually lie on its diagonal.
diags = {offset:(diag[offset:nrow+offset] if 0<=offset<=ncol else
                 diag if offset+nrow-ncol>=0 else
                 diag[:offset+nrow-ncol])
         for offset, diag in zip(A.offsets, A.data)}
for offset, diag in sorted(diags.items()):
    print('{o}: {d}'.format(o=offset, d=diag))
Thus for the array
[[ 0. 1. 2. 3. 4.]
[ 5. 6. 7. 8. 9.]
[ 10. 11. 12. 13. 14.]
[ 15. 16. 17. 18. 19.]
[ 20. 21. 22. 23. 24.]
[ 25. 26. 27. 28. 29.]
[ 30. 31. 32. 33. 34.]
[ 35. 36. 37. 38. 39.]
[ 40. 41. 42. 43. 44.]
[ 45. 46. 47. 48. 49.]]
the code above yields
-9: [ 45.]
-8: [ 40. 46.]
-7: [ 35. 41. 47.]
-6: [ 30. 36. 42. 48.]
-5: [ 25. 31. 37. 43. 49.]
-4: [ 20. 26. 32. 38. 44.]
-3: [ 15. 21. 27. 33. 39.]
-2: [ 10. 16. 22. 28. 34.]
-1: [ 5. 11. 17. 23. 29.]
0: [ 0. 6. 12. 18. 24.]
1: [ 1. 7. 13. 19.]
2: [ 2. 8. 14.]
3: [ 3. 9.]
4: [ 4.]
The output above prints each offset followed by the diagonal at that offset. The code should work for any sparse array once it has been converted to dia format with todia(). I used a fully populated sparse array only to make it easier to check that the output is correct.
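In more recent SciPy releases (1.0 and later, to the best of my knowledge) the diagonal method of sparse matrices accepts an offset k, which may be the shortest route for the example in the question. A minimal sketch under that assumption; it applies to other sparse formats as well, shown here with CSR.

```python
import numpy as np
from scipy import sparse

# The matrix from the question: a 6x6 matrix with ones on the first superdiagonal.
A = sparse.diags(np.ones(5), 1)

# k > 0 selects a diagonal above the main one, k < 0 one below it.
superdiag = A.diagonal(k=1)
main_diag = A.diagonal(k=0)

# The same call on a CSR copy of the matrix.
superdiag_csr = A.tocsr().diagonal(k=1)

print(superdiag)
```

The result is a dense 1-D NumPy array holding just that diagonal; the matrix itself is never converted to a dense array.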