I have a nested list that contains 1002 time steps, and in each time step I have observations of 11 features. I have read the docs related to padding, but I could not find out how to add zero elements at the end of each list. I found that the longest list is, for example, the 24th item in my main list, and now I want to pad all the other elements to that length (the 24th element already has the right length). As an example:
a = [[1,2,3,4,5,6,76,7],[2,2,3,4,2,5,5,5,7,8,9,33,677,8,8,9,9],[2,3,46,7,8,9],[3,3,3,5],[2,2],[1,1],[2,2]]
a[1] = padding(a[1],len(a[2]) with zeros at the end of the list)
I have tried the following:
import numpy as np
def pad_or_truncate(some_list, target_len):
    return some_list[:target_len] + [0]*(target_len - len(some_list))
for i in range(len(Length)):
    pad_or_truncate(Length[i],len(Length[24]))
    print(len(Length[i]))
or
for i in range(len(Length)):
    df_train_array = np.pad(Length[i],len(Length[24]),mode='constant')
and I got this error: Unable to coerce to Series, length must be 11: given 375
Solution 1
# length of the longest list
max_len = max(len(x) for x in a)
# append max_len zeros to every list
temp = [x + [0]*max_len for x in a]
# cut every list back to the target length
[x[:max_len] for x in temp]
Solution 2 using pandas
import pandas as pd
df = pd.DataFrame(a)
df.fillna(0).astype(int).values.tolist()
Output
[[1, 2, 3, 4, 5, 6, 76, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[2, 2, 3, 4, 2, 5, 5, 5, 7, 8, 9, 33, 677, 8, 8, 9, 9],
[2, 3, 46, 7, 8, 9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[3, 3, 3, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
...]
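For comparison, the helper from the question already computes the right thing; it returns a new list instead of changing its argument, so the result has to be stored. A minimal sketch, assuming the target length is the length of the longest sublist (the name `padded` is only illustrative):

```python
def pad_or_truncate(some_list, target_len):
    """Return a copy of some_list cut or zero-padded to exactly target_len items."""
    return some_list[:target_len] + [0] * (target_len - len(some_list))


a = [[1, 2, 3, 4, 5, 6, 76, 7],
     [2, 2, 3, 4, 2, 5, 5, 5, 7, 8, 9, 33, 677, 8, 8, 9, 9],
     [2, 3, 46, 7, 8, 9],
     [3, 3, 3, 5], [2, 2], [1, 1], [2, 2]]

# Length of the longest sublist is the padding target
target_len = max(len(row) for row in a)

# Keep the returned lists; the originals in `a` are left untouched
padded = [pad_or_truncate(row, target_len) for row in a]
```

Because the helper slices before padding, the same call also shortens lists that are longer than the target.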
The following code snippet pads each list in place with the appropriate number of 0s (determined by the length of the longest list):
def main():
    data = [
        [1,2,3,4,5,6,76,7],
        [2,2,3,4,2,5,5,5,7,8,9,33,677,8,8,9,9],
        [2,3,46,7,8,9,],
        [3,3,3,5],
        [2,2],
        [1,1],
        [2,2]
    ]
    # find the list with the maximum elements
    max_length = max(map(len, data))
    for element in data:
        for _ in range(len(element), max_length):
            element.append(0)
if __name__ == '__main__':
    main()
You can use this one-liner, which uses np.pad (as written, it pads at the front):
list(map(lambda x: np.pad(x, (max(map(len, a)) - len(x), 0)).tolist(), a))
[[0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 76, 7],
[2, 2, 3, 4, 2, 5, 5, 5, 7, 8, 9, 33, 677, 8, 8, 9, 9],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 3, 46, 7, 8, 9],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 3, 3, 5],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2]]
Use this if you want to pad at the end instead:
list(map(lambda x: np.pad(x, (0, max(map(len, a)) - len(x))).tolist(), a))
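If the padded data is going to end up in NumPy anyway, another option is to allocate the rectangular array first and copy each list into it. This is only a sketch; the function name `to_padded_array` and its `fill` parameter are made up for illustration, and it assumes integer data.

```python
import numpy as np


def to_padded_array(rows, fill=0):
    """Copy ragged lists into a 2-D integer array, padding on the right with `fill`."""
    width = max((len(r) for r in rows), default=0)
    out = np.full((len(rows), width), fill, dtype=int)
    for i, r in enumerate(rows):
        # Only the first len(r) cells of row i are overwritten
        out[i, :len(r)] = r
    return out


arr = to_padded_array([[1, 2, 3], [4], [5, 6]])
```

The result is a regular 2-D array, so `arr.tolist()` gives back padded lists if those are needed.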
Related
Say I have a list containing lists.
board = [[5, 3, 0, 0, 7, 0, 1, 0, 0],
[6, 0, 0, 1, 9, 5, 0, 0, 0],
[0, 9, 8, 0, 0, 0, 0, 6, 0],
[8, 0, 0, 0, 6, 0, 0, 0, 3],
[4, 0, 0, 8, 0, 3, 0, 0, 1],
[7, 0, 0, 0, 2, 0, 0, 0, 6],
[0, 6, 0, 0, 0, 0, 2, 8, 0],
[0, 0, 0, 4, 1, 9, 0, 0, 5],
[0, 0, 0, 0, 8, 0, 0, 7, 9]]
I want to append every three numbers in each sublist to a key in a dictionary.
For example
dd = {1:[5, 3, 0,6, 0, 0,0, 9, 8]}
Then I'll look for the next 3x3 section.
dd = {2:[0,7,0,1,9,5,0,0,0]}
In total I should have 9 keys each with a list of 9 elements.
This clearly doesn't work XD:
board = [[5, 3, 0, 0, 7, 0, 1, 0, 0],
[6, 0, 0, 1, 9, 5, 0, 0, 0],
[0, 9, 8, 0, 0, 0, 0, 6, 0],
[8, 0, 0, 0, 6, 0, 0, 0, 3],
[4, 0, 0, 8, 0, 3, 0, 0, 1],
[7, 0, 0, 0, 2, 0, 0, 0, 6],
[0, 6, 0, 0, 0, 0, 2, 8, 0],
[0, 0, 0, 4, 1, 9, 0, 0, 5],
[0, 0, 0, 0, 8, 0, 0, 7, 9]]
xBx = {}
count1 = 1
count2 = 3
count3 = 0
for row in board:
    if count1 == 4:
        count = 1
    xBx[count1] = row[count3:count2]
    count1 += 1
    count2 += 3
    count3 += 3
print(xBx)
Using numpy, it's easier to slice a multi-dimensional array:
import numpy as np
board = np.array(board)
size = 3  # side length of one box
result = {}
for i in range(len(board) // size):
    for j in range(len(board) // size):
        values = board[j * size:(j + 1) * size, i * size:(i + 1) * size]
        result[i * size + j + 1] = values.flatten()
And result gives the following (note that the boxes are numbered column by column, top to bottom):
{
1: array([5, 3, 0, 6, 0, 0, 0, 9, 8]),
2: array([8, 0, 0, 4, 0, 0, 7, 0, 0]),
3: array([0, 6, 0, 0, 0, 0, 0, 0, 0]),
4: array([0, 7, 0, 1, 9, 5, 0, 0, 0]),
5: array([0, 6, 0, 8, 0, 3, 0, 2, 0]),
6: array([0, 0, 0, 4, 1, 9, 0, 8, 0]),
7: array([1, 0, 0, 0, 0, 0, 0, 6, 0]),
8: array([0, 0, 3, 0, 0, 1, 0, 0, 6]),
9: array([2, 8, 0, 0, 0, 5, 0, 7, 9])
}
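The question numbers the boxes left to right and then top to bottom (box 2 is the top-middle one). A possible way to get that ordering without explicit loops is to reshape the board into a grid of blocks. This is a sketch, assuming a square board whose side is a multiple of `size`; the name `blocks` is illustrative.

```python
import numpy as np

board = np.array([[5, 3, 0, 0, 7, 0, 1, 0, 0],
                  [6, 0, 0, 1, 9, 5, 0, 0, 0],
                  [0, 9, 8, 0, 0, 0, 0, 6, 0],
                  [8, 0, 0, 0, 6, 0, 0, 0, 3],
                  [4, 0, 0, 8, 0, 3, 0, 0, 1],
                  [7, 0, 0, 0, 2, 0, 0, 0, 6],
                  [0, 6, 0, 0, 0, 0, 2, 8, 0],
                  [0, 0, 0, 4, 1, 9, 0, 0, 5],
                  [0, 0, 0, 0, 8, 0, 0, 7, 9]])

size = 3
n = board.shape[0] // size  # number of boxes per side

# Axes after the first reshape: (block row, row in block, block col, col in block).
# Swapping the two middle axes groups each box's cells together.
blocks = board.reshape(n, size, n, size).swapaxes(1, 2).reshape(n * n, size * size)

# Keys 1..9, boxes in reading order
result = {k + 1: blocks[k].tolist() for k in range(n * n)}
```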
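This groups the values by bands of three columns. Note that it produces 3 keys with 27 values each, not 9 boxes of 9 values: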
board = [[5, 3, 0, 0, 7, 0, 1, 0, 0],
[6, 0, 0, 1, 9, 5, 0, 0, 0],
[0, 9, 8, 0, 0, 0, 0, 6, 0],
[8, 0, 0, 0, 6, 0, 0, 0, 3],
[4, 0, 0, 8, 0, 3, 0, 0, 1],
[7, 0, 0, 0, 2, 0, 0, 0, 6],
[0, 6, 0, 0, 0, 0, 2, 8, 0],
[0, 0, 0, 4, 1, 9, 0, 0, 5],
[0, 0, 0, 0, 8, 0, 0, 7, 9]]
d = {}
for i in range(len(board[0]) // 3):
    for l in board:
        if i + 1 in d.keys():
            d[i + 1].append(l[0 + i*3])
            d[i + 1].append(l[1 + i*3])
            d[i + 1].append(l[2 + i*3])
        else:
            d[i + 1] = []
            d[i + 1].append(l[0 + i*3])
            d[i + 1].append(l[1 + i*3])
            d[i + 1].append(l[2 + i*3])
for key, value in d.items():
    print(f"{key} {value}")
This is probably what you are looking for:
size = 3
l = len(board)
boxes = {}  # named boxes rather than dict, to avoid shadowing the built-in
box_index = 1
for x in range(0, l, size):
    for y in range(0, l, size):
        a = []
        for i in range(x, x + size):
            for j in range(y, y + size):
                a.append(board[i][j])
        boxes[box_index] = a
        box_index += 1
print(boxes)
Shorter variant
size, l, boxes, box_index = 3, len(board), {}, 1
for x in range(0, l, size):
    for y in range(0, l, size):
        boxes[box_index], box_index = [board[i][j] for i in range(x, x + size) for j in range(y, y + size)], box_index + 1
print(boxes)
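The same grouping can also be written by computing, for every cell, which box it belongs to. The sketch below assumes the reading-order numbering asked for in the question; `boxes_by_index` is a hypothetical helper name.

```python
from collections import defaultdict


def boxes_by_index(board, size=3):
    """Map box number (1-based, reading order) to the values in that box."""
    boxes = defaultdict(list)
    boxes_per_row = len(board[0]) // size
    for r, row in enumerate(board):
        for c, value in enumerate(row):
            # Integer division tells which block row / block column the cell is in
            key = (r // size) * boxes_per_row + c // size + 1
            boxes[key].append(value)
    return dict(boxes)


board = [[5, 3, 0, 0, 7, 0, 1, 0, 0],
         [6, 0, 0, 1, 9, 5, 0, 0, 0],
         [0, 9, 8, 0, 0, 0, 0, 6, 0],
         [8, 0, 0, 0, 6, 0, 0, 0, 3],
         [4, 0, 0, 8, 0, 3, 0, 0, 1],
         [7, 0, 0, 0, 2, 0, 0, 0, 6],
         [0, 6, 0, 0, 0, 0, 2, 8, 0],
         [0, 0, 0, 4, 1, 9, 0, 0, 5],
         [0, 0, 0, 0, 8, 0, 0, 7, 9]]

boxes = boxes_by_index(board)
```

The formula `(r // size) * boxes_per_row + c // size` is also handy on its own, for example to look up the box of a single cell in a Sudoku checker.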
myBoard = [[0, 4, 0, 7, 0, 0, 1, 3, 0],
[0, 0, 2, 0, 0, 0, 6, 0, 0],
[0, 0, 0, 4, 2, 0, 0, 0, 0],
[6, 0, 0, 0, 0, 2, 0, 0, 3],
[2, 3, 1, 0, 7, 0, 0, 8, 0],
[4, 0, 0, 3, 1, 0, 0, 0, 0],
[0, 7, 0, 0, 0, 8, 0, 0, 0],
[0, 0, 6, 0, 3, 0, 0, 0, 4],
[8, 9, 0, 0, 5, 0, 0, 0, 6]]
How can I read a board like this as input from the user?
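Try this.
myBoard = []
rows = 9
for x in range(rows):
    # split the line on commas and convert each piece to an integer
    line = [int(v) for v in input().split(',')]
    myBoard.append(line)
Or you can use a list comprehension:
rows = 9
myBoard = [[int(v) for v in input().split(',')] for x in range(rows)]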
You can enter comma-separated input like this:
0,4,0,7,0,0,1,3,0
0,0,2,0,0,0,6,0,0
0,0,0,4,2,0,0,0,0
6,0,0,0,0,2,0,0,3
2,3,1,0,7,0,0,8,0
4,0,0,3,1,0,0,0,0
0,7,0,0,0,8,0,0,0
0,0,6,0,3,0,0,0,4
8,9,0,0,5,0,0,0,6
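User input is easy to mistype, so it may be worth validating it while parsing. The following is a sketch only: `parse_board` is a hypothetical helper that takes any iterable of text lines, which also makes it testable without typing at a prompt.

```python
def parse_board(lines, size=9):
    """Parse comma-separated lines into a size x size board of integers."""
    board = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        row = [int(v) for v in line.split(",")]
        if len(row) != size:
            raise ValueError(f"expected {size} values, got {len(row)}: {line!r}")
        board.append(row)
    if len(board) != size:
        raise ValueError(f"expected {size} rows, got {len(board)}")
    return board


text = """\
0,4,0,7,0,0,1,3,0
0,0,2,0,0,0,6,0,0
0,0,0,4,2,0,0,0,0
6,0,0,0,0,2,0,0,3
2,3,1,0,7,0,0,8,0
4,0,0,3,1,0,0,0,0
0,7,0,0,0,8,0,0,0
0,0,6,0,3,0,0,0,4
8,9,0,0,5,0,0,0,6
"""
my_board = parse_board(text.splitlines())
```

For interactive use, the same function could be called as `parse_board(input() for _ in range(9))`.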
I'd like to take the difference of non-adjacent values within 2D numpy array along axis=-1 (per row). An array can consist of a large number of rows.
Each row is a selection of values along a timeline from 1 to N.
For N=12, the array could look like below 3x12 shape:
timeline = np.array([[ 0, 0, 0, 4, 0, 6, 0, 0, 9, 0, 11, 0],
[ 1, 0, 3, 4, 0, 0, 0, 0, 9, 0, 0, 12],
[ 0, 0, 0, 4, 0, 0, 0, 0, 9, 0, 0, 0]])
The desired result should look like: (size of array is intact and position is important)
diff = np.array([[ 0, 0, 0, 4, 0, 2, 0, 0, 3, 0, 2, 0],
[ 1, 0, 2, 1, 0, 0, 0, 0, 5, 0, 0, 3],
[ 0, 0, 0, 4, 0, 0, 0, 0, 5, 0, 0, 0]])
I am aware of the solution in 1D (see "Diff on non-adjacent values"):
imask = np.flatnonzero(timeline)
diff = np.zeros_like(timeline)
diff[imask] = np.diff(timeline[imask], prepend=0)
within which the last line can be replaced with
diff[imask[0]] = timeline[imask[0]]
diff[imask[1:]] = timeline[imask[1:]] - timeline[imask[:-1]]
and the first line can be replaced with
imask = np.where(timeline != 0)[0]
Attempting to generalise the 1D solution, I can see that imask = np.flatnonzero(timeline) is undesirable because the rows become inter-dependent. So I am trying the alternative, np.nonzero:
imask = np.nonzero(timeline)
diff = np.zeros_like(timeline)
diff[imask] = np.diff(timeline[imask], prepend=0)
However, this solution still couples the rows: the first non-zero value of each row is differenced against the last non-zero value of the previous row.
array([[ 0, 0, 0, 4, 0, 2, 0, 0, 3, 0, 2, 0],
[-10, 0, 2, 1, 0, 0, 0, 0, 5, 0, 0, 3],
[ 0, 0, 0, -8, 0, 0, 0, 0, 5, 0, 0, 0]])
How can I make the "prepend" start each row with a zero?
Wow, I did it. (It was an interesting problem for me too.)
I wrote a non_adjacent_diff function that handles a single row, and applied it to every row using np.apply_along_axis.
Try this code.
timeline = np.array([[ 0, 0, 0, 4, 0, 6, 0, 0, 9, 0, 11, 0],
[ 1, 0, 3, 4, 0, 0, 0, 0, 9, 0, 0, 12],
[ 0, 0, 0, 4, 0, 0, 0, 0, 9, 0, 0, 0]])
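def non_adjacent_diff(row):
    # indices of the non-zero entries in this row
    not_zero_index = np.where(row != 0)
    # differences between consecutive non-zero values
    diff = row[not_zero_index][1:] - row[not_zero_index][:-1]
    # overwrite every non-zero entry except the first (this modifies row in place)
    np.put(row, not_zero_index[0][1:], diff)
    return row
np.apply_along_axis(non_adjacent_diff, 1, timeline)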
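For arrays with many rows, a loop-free variant is also possible. The sketch below relies on np.nonzero returning indices in row-major order, so consecutive entries belong to the same row until the row index changes; the function name `non_adjacent_diff_2d` is made up for illustration, and unlike the row-wise function it leaves the input untouched.

```python
import numpy as np


def non_adjacent_diff_2d(timeline):
    """Per-row difference between consecutive non-zero values (first one is kept)."""
    rows, cols = np.nonzero(timeline)
    vals = timeline[rows, cols]

    # Previous non-zero value for each entry, in flattened row-major order
    prev = np.zeros_like(vals)
    prev[1:] = vals[:-1]

    # The first non-zero entry of every row must not see the previous row
    first_in_row = np.ones(len(rows), dtype=bool)
    first_in_row[1:] = rows[1:] != rows[:-1]
    prev[first_in_row] = 0

    out = np.zeros_like(timeline)
    out[rows, cols] = vals - prev
    return out


timeline = np.array([[0, 0, 0, 4, 0, 6, 0, 0, 9, 0, 11, 0],
                     [1, 0, 3, 4, 0, 0, 0, 0, 9, 0, 0, 12],
                     [0, 0, 0, 4, 0, 0, 0, 0, 9, 0, 0, 0]])
diff = non_adjacent_diff_2d(timeline)
```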
I'm going to do the following operation of a list or numpy array:
[0, 0, 0, 1, 0, 0, 4, 2, 0, 7, 0, 0, 0]
move all non-zeros to the right side:
[0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 4, 2, 7]
How can I do this efficiently?
Thanks
============
Sorry I didn't make it clear: I need the order of the non-zero elements to be preserved.
You could sort the list by the boolean value of its elements. All falsy values (just zero for numbers) will get pushed to the front of the list. Python's built-in sort is guaranteed to be stable, so the other values will keep their relative order.
Example:
>>> a = [0, 0, 0, 1, 0, 0, 5, 2, 0, 7, 0, 0, 0]
>>> sorted(a, key=bool)
[0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 5, 2, 7]
Using NumPy:
>>> a = np.array([0, 0, 0, 1, 0, 0, 4, 2, 0, 7, 0, 0, 0])
>>> np.concatenate((a[a==0], a[a!=0]))
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 4, 2, 7])
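The idea of sorting by "is it non-zero" carries over to NumPy with a stable argsort, which also handles 2-D input row by row. This is a sketch under that assumption; `push_nonzeros_right` is an illustrative name.

```python
import numpy as np


def push_nonzeros_right(arr):
    """Move zeros to the left along the last axis, keeping the order of the rest."""
    arr = np.asarray(arr)
    # False (zero) sorts before True (non-zero); a stable sort keeps ties in order
    order = np.argsort(arr != 0, axis=-1, kind="stable")
    return np.take_along_axis(arr, order, axis=-1)


flat = push_nonzeros_right([0, 0, 0, 1, 0, 0, 4, 2, 0, 7, 0, 0, 0])
grid = push_nonzeros_right([[3, 0, 1], [0, 2, 0]])
```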
You can also do this in O(N) time in Python with a simple for-loop. It takes some extra memory, though, which @grc's solution can avoid by sorting in place with a.sort(key=bool):
>>> from collections import deque
>>> # Using a deque
>>> def solve_deque(lst):
...     d = deque()
...     append_l = d.appendleft
...     append_r = d.append
...     for x in lst:
...         if x:
...             append_r(x)
...         else:
...             append_l(x)
...     return list(d)  # Convert to list if you want O(1) indexing.
...
>>> # Using a simple list
>>> def solve_list(lst):
...     left = []
...     right = []
...     left_a = left.append
...     right_a = right.append
...     for x in lst:
...         if x:
...             right_a(x)
...         else:
...             left_a(x)
...     left.extend(right)
...     return left
...
>>> solve_list([0, 0, 0, 1, 0, 0, 4, 2, 0, 7, 0, 0, 0])
[0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 4, 2, 7]
>>> solve_deque([0, 0, 0, 1, 0, 0, 4, 2, 0, 7, 0, 0, 0])
[0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 4, 2, 7]
I have a list of lists and I want to be able to refer to the 1st, 2nd, 3rd, etc. column in a list of lists. Here is my code for the list:
matrix = [
[0, 0, 0, 5, 0, 0, 0, 0, 6],
[8, 0, 0, 0, 4, 7, 5, 0, 3],
[0, 5, 0, 0, 0, 3, 0, 0, 0],
[0, 7, 0, 8, 0, 0, 0, 0, 9],
[0, 0, 0, 0, 1, 0, 0, 0, 0],
[9, 0, 0, 0, 0, 4, 0, 2, 0],
[0, 0, 0, 9, 0, 0, 0, 1, 0],
[7, 0, 8, 3, 2, 0, 0, 0, 5],
[3, 0, 0, 0, 0, 8, 0, 0, 0],
]
I want to be able to say something like:
matrix = [
[0, 0, 0, 5, 0, 0, 0, 0, 6],
[8, 0, 0, 0, 4, 7, 5, 0, 3],
[0, 5, 0, 0, 0, 3, 0, 0, 0],
[0, 7, 0, 8, 0, 0, 0, 0, 9],
[0, 0, 0, 0, 1, 0, 0, 0, 0],
[9, 0, 0, 0, 0, 4, 0, 2, 0],
[0, 0, 0, 9, 0, 0, 0, 1, 0],
[7, 0, 8, 3, 2, 0, 0, 0, 5],
[3, 0, 0, 0, 0, 8, 0, 0, 0],
]
if (The fourth column in this matrix does not have any 1's in it):
(then do something)
I want to know what the Python syntax would be for the parts in parentheses.
The standard way to do what you asked is with a comprehension over the rows:
if (The fourth column in this matrix does not have any 1's in it):
translates to:
if not any(row[3] == 1 for row in matrix):
    # do something
However, depending on how often you need to perform this operation, how big your matrix is, and so on, you might wish to look into numpy, as it makes addressing columns easier (and considerably faster). An example:
>>> import numpy as np
>>> matrix = np.random.randint(0, 10, (5, 5))
>>> matrix
array([[3, 0, 9, 9, 3],
[5, 7, 7, 7, 6],
[5, 4, 6, 2, 2],
[1, 3, 5, 0, 5],
[3, 9, 7, 8, 6]])
>>> matrix[..., 3] #fourth column
array([9, 7, 2, 0, 8])
Try this:
if all(row[3] != 1 for row in matrix):
    # do something
The row[3] part looks at the fourth element of a row, and the for row in matrix part goes through all the rows in the matrix. Together they yield the fourth element of every row, that is, the whole fourth column. If every element of the fourth column is different from one, the condition is satisfied and you can do what you need inside the if.
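If columns are needed in more than one place, it can help to pull the extraction into a small function, or to transpose the whole matrix once with zip. A sketch, where `column` is a hypothetical helper name:

```python
def column(matrix, index):
    """Return one column of a list of lists as a new list."""
    return [row[index] for row in matrix]


matrix = [
    [0, 0, 0, 5, 0, 0, 0, 0, 6],
    [8, 0, 0, 0, 4, 7, 5, 0, 3],
    [0, 5, 0, 0, 0, 3, 0, 0, 0],
    [0, 7, 0, 8, 0, 0, 0, 0, 9],
    [0, 0, 0, 0, 1, 0, 0, 0, 0],
    [9, 0, 0, 0, 0, 4, 0, 2, 0],
    [0, 0, 0, 9, 0, 0, 0, 1, 0],
    [7, 0, 8, 3, 2, 0, 0, 0, 5],
    [3, 0, 0, 0, 0, 8, 0, 0, 0],
]

fourth = column(matrix, 3)
has_no_ones = 1 not in fourth

# zip(*matrix) transposes the matrix, so columns[j] is the j-th column
columns = [list(col) for col in zip(*matrix)]
```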
A more traditional approach would be:
found_one = False
for i in range(len(matrix)):
    if matrix[i][3] == 1:
        found_one = True
        break
if not found_one:
    # do something
Here I'm iterating over all the rows (i index) of the fourth column (3 index), and checking whether an element is equal to one: if matrix[i][3] == 1:. Notice that the for loop goes from index 0 up to the "height" of the matrix minus one; that's what the range(len(matrix)) part says. Since the goal is to act when the column contains no 1, the final test is if not found_one:.
if 1 not in [row[3] for row in matrix]:
    # do something