How do I improve the run time to receive the training samples? - python

I have the following problem: the code snippet below is correct, but it is slow. For example, it takes about 10 minutes to run. I don't know how to speed it up. Does anybody have an idea?
#Dataframe:
import numpy as np
import scipy.sparse as sp
from pandas import DataFrame
list_l = [[0, 3, 8, 0], [0, 8, 7, 0], [0, 2, 9, 0], [1, 10, 10, 1], [2, 3, 8, 2], [2, 10, 10, 2], [3, 4, 12, 3], [3, 12, 4, 3], [3, 3, 8, 3], [4, 12, 4, 4], [4, 3, 8, 4], [4, 4, 12, 4], [5, 8, 7, 5], [5, 6, 13, 5], [5, 3, 8, 5], [6, 0, 3, 6], [6, 5, 11, 6], [6, 12, 4, 6], [7, 9, 6, 7], [7, 9, 6, 7], [8, 13, 5, 8], [9, 1, 0, 9], [9, 7, 2, 9], [9, 11, 1, 9], [9, 11, 1, 9]]
# Note: location isn't relevant
df = DataFrame(list_l, columns=['buyerid', 'itemid', 'group', 'location'])
#train_mat:
# You have to generate train_mat yourself.
# df_main = complete dataframe, data = split dataframe (the complete dataframe is also fine)
def generate_matrix(df_main, data):
    num_users = df_main["buyerid"].nunique()
    num_items = df_main["itemid"].nunique()
    print(num_users)
    print(num_items)
    mat = sp.dok_matrix((num_users, num_items), dtype=np.float32)
    for buyerid, itemid in zip(data['buyerid'], data['itemid']):
        mat[buyerid, itemid] = 1.0
    print(mat)
    return mat
#num_negatives:
# num_negatives = 4
Code:
# allData = complete Dataframe, train_mat = one hot encoding matrix, num_negatives = integer
def get_train_samples(allData, train_mat, num_negatives):
    user_input, item_input, labels = [], [], []
    num_user, num_item = train_mat.shape
    for (u, i) in train_mat.keys():
        user_input.append(u)
        item_input.append(i)
        labels.append(1)
        # negative instances
        for t in range(num_negatives):
            j = np.random.randint(num_item)
            if allData.loc[(allData['buyerid'] == u)&(allData['itemid'] == i)].empty:
                j = np.random.randint(num_item)
            user_input.append(u)
            item_input.append(j)
            labels.append(0)
    return user_input, item_input, labels

Some things to improve here:
for t in range(num_negatives):
    j = np.random.randint(num_item)
    if allData.loc[(allData['buyerid'] == u)&(allData['itemid'] == i)].empty:
        j = np.random.randint(num_item)
You are essentially saying "j is a random number; if some condition is true, set j to a random number". Isn't that just "j is a random number" with extra steps?
I'd say this does the same thing, but faster, because it skips the DataFrame lookup on every iteration:
for t in range(num_negatives):
    j = np.random.randint(num_item)
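A minimal sketch of one way to drop the per-sample DataFrame scan, assuming the check was meant to redraw j whenever (u, j) is already an observed pair. The name get_train_samples_fast and the train_pairs and seed parameters are hypothetical and not part of the question's code; train_pairs stands for the (buyerid, itemid) pairs, for example those returned by train_mat.keys().

```python
import numpy as np


def get_train_samples_fast(train_pairs, num_items, num_negatives, seed=None):
    """Hypothetical helper: build (user, item, label) training triples.

    Assumes every user has at least one item they did not interact with;
    otherwise the redraw loop below would never terminate.
    """
    rng = np.random.default_rng(seed)
    positives = set(train_pairs)  # set lookup replaces the DataFrame filter
    user_input, item_input, labels = [], [], []
    for u, i in sorted(positives):
        # positive instance
        user_input.append(u)
        item_input.append(i)
        labels.append(1)
        # negative instances: redraw until (u, j) is not an observed pair
        for _ in range(num_negatives):
            j = int(rng.integers(num_items))
            while (u, j) in positives:
                j = int(rng.integers(num_items))
            user_input.append(u)
            item_input.append(j)
            labels.append(0)
    return user_input, item_input, labels


pairs = [(0, 3), (0, 8), (1, 10), (2, 3)]
users, items, labels = get_train_samples_fast(
    pairs, num_items=14, num_negatives=4, seed=0
)
print(len(labels), sum(labels))
```

The membership test on a set takes roughly constant time, whereas filtering the DataFrame scans every row for every sample, so the gain should grow with the size of the data.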

Related

Splitting list of elements without numpy array function [duplicate]

This question already has answers here:
How do I split a list into equally-sized chunks?
(66 answers)
Closed 1 year ago.
Example: I have a list:
[8, 3, 4, 1, 5, 9, 6, 7, 2]
And I need to make it look like this but without using numpy.array_split():
[[8, 3, 4], [1, 5, 9], [6, 7, 2]]
How can I do it? It should work not only for this one case: with 4 elements I want chunks of 2 and 2, with 9 elements 3, 3, 3, with 16 elements 4, 4, 4, 4, and so on.
You can take the square root of the list's length and then split the list with a list comprehension. This works for lists of length 4, 9, 16, ...:
lst = [8, 3, 4, 1, 5, 9, 6, 7, 2]
lst2 = [8, 3, 4, 1]
def split_equal(lst):
    len_ = len(lst)
    # Return an empty list if the list has no items.
    if len_ == 0:
        return []
    n = int(len_ ** 0.5)
    return [lst[i:i + n] for i in range(0, len_, n)]
print(split_equal(lst))
print(split_equal(lst2))
output:
[[8, 3, 4], [1, 5, 9], [6, 7, 2]]
[[8, 3], [4, 1]]
You can use this:
def splitter(inlist):
    n = len(inlist)
    m = int(n ** 0.5)
    if m*m != n:
        raise ValueError("list length must be a perfect square")
    # row i starts at offset i*m
    return [[inlist[i*m+j] for j in range(m)] for i in range(m)]
print(splitter([8, 3, 4, 1]))
print(splitter([8, 3, 4, 1, 5, 9, 6, 7, 2]))
print(splitter([8, 3, 4, 1, 5, 9, 6, 7, 2, 8, 3, 4, 1, 5, 9, 6]))
Result:
[[8, 3], [4, 1]]
[[8, 3, 4], [1, 5, 9], [6, 7, 2]]
[[8, 3, 4, 1], [5, 9, 6, 7], [2, 8, 3, 4], [1, 5, 9, 6]]
Careful: it raises an exception if the length of the input list is not a perfect square.
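The perfect-square check can also stay entirely in integer arithmetic. The sketch below uses math.isqrt (available from Python 3.8) instead of a floating-point square root; the function name split_square is hypothetical and only serves as an illustration.

```python
import math


def split_square(items):
    """Split a list whose length is a perfect square into equal rows."""
    n = len(items)
    side = math.isqrt(n)  # integer square root, no floating point involved
    if side * side != n:
        raise ValueError("list length must be a perfect square")
    # row r covers the slice [r*side, (r+1)*side)
    return [items[r * side:(r + 1) * side] for r in range(side)]


print(split_square([8, 3, 4, 1, 5, 9, 6, 7, 2]))
```

An empty list has length 0, which is a perfect square, so this variant returns an empty list for it rather than raising.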
def equal_array_split(arr, split_arr_len):
    array_length = len(arr)
    if array_length % split_arr_len == 0:
        return [arr[i:i+split_arr_len] for i in range(0,array_length,split_arr_len)]
    else:
        return "Invalid split array length!!"
print(equal_array_split([1,2,3,4,5,6,7,8,9],3))
print(equal_array_split([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],4))
print(equal_array_split([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],8))
print(equal_array_split([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],2))
Output:
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
[[1, 2, 3, 4, 5, 6, 7, 8], [9, 10, 11, 12, 13, 14, 15, 16]]
[[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14], [15, 16]]
You can slice the list with a list comprehension, assuming the length of the input is a perfect square:
import numpy as np
arr = [8, 3, 4, 1, 5, 9, 6, 7, 2]
n = int(np.sqrt(len(arr)))
result = [arr[i*n:(i+1)*n] for i in range(n)]
With the split value being the square root of the list's length:
lst = [8, 3, 4, 1, 5, 9, 6, 7, 2]
result = []
N = 3  # split_value: the chunk size used to split the list (3 for this 9-element list)
for i in range(0, len(lst), N):
    result.append(lst[i:i+N])
print(result)
Without numpy:
a = [8, 3, 4, 1, 5, 9, 6, 7, 2]
splitedSize = 3
a_splited = [a[x:x+splitedSize] for x in range(0, len(a), splitedSize)]
print(a_splited)

extend item then remove item in nested list in python [closed]

I am trying to group the items of a nested list in Python. The list is huge, and I haven't succeeded yet because of an index error. My goal: if two items in the list share a member, extend item1 with item2 and remove item2. I don't have much experience with Python. I hope you can help.
My pseudocode:
L = [[0, 1], [2, 3], [4, 5, 13], [6, 7], [2, 8],[3, 10, 11], [12, 13]]
for i in range(len(L)-1):
    for j in range(i+1,len(L)):
        if i!=j and set(L[i]) & set(L[j]) != set():
            L[i].extend(L[j])
            L.remove(L[j])
expected L = [[0,1], [6, 7], [2, 3, 2, 8, 3, 10, 11], [4, 5, 13, 12, 13]]
L = [[0, 1], [2, 3], [4, 5, 13], [6, 7], [2, 8],[3, 10, 11], [12, 13]]
out = []
while L:
    current = L.pop(0)
    out.append(current)
    tmp = []
    for v in L:
        if set(v).intersection(current):
            current.extend(v)
        else:
            tmp.append(v)
    L = tmp
print(out)
Prints:
[[0, 1], [2, 3, 2, 8, 3, 10, 11], [4, 5, 13, 12, 13], [6, 7]]
EDIT: Version 2:
L = [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9],[10, 11], [1,3,5,7,9,11]]
out = []
while L:
    current = L[0]
    while True:
        tmp = []
        for i, v in enumerate(L[1:], 1):
            if set(v).intersection(current):
                current.extend(L.pop(i))
                break
            else:
                tmp.append(v)
        else:
            break
    out.append(current)
    L = tmp
print(out)
Prints:
[[0, 1, 1, 3, 5, 7, 9, 11, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]]
For L = [[0, 1], [2, 3], [4, 5, 13], [6, 7], [2, 8],[3, 10, 11], [12, 13]] prints:
[[0, 1], [2, 3, 2, 8, 3, 10, 11], [4, 5, 13, 12, 13], [6, 7]]

How to draw a random sample of chunks from an array/list (with replacement) instead of sampling the individual elements of the list with Python

I am able to split a list into chunks, as demonstrated in the Python code below:
def split_list(the_list, chunk_size):
    result_list = []
    while the_list:
        result_list.append(the_list[:chunk_size])
        the_list = the_list[chunk_size:]
    return result_list
a_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(split_list(a_list, 3))
which yields the following list of chunks:
# [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
I am also aware of how to draw a random sample with numpy.random.choice (even with replacement), as demonstrated below:
import numpy as np
a_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
np.random.choice(a_list, size=20, replace=True)
which yields the following result:
# array([ 6,  9,  4,  9,  1,  1,  6, 10,  8,  5, 10,  6,  2,  6,  7,  1,  3,
#         2,  7,  6])
What I want
I want to sample chunks of the list with replacement, leaving the elements of each chunk as they are.
I am looking for code that produces something like this:
# [[7, 8, 9], [1, 2, 3], [4, 5, 6], [7, 8, 9], [10], [1, 2, 3], [1, 2, 3], [10], [1, 2, 3], [7, 8, 9], [1, 2, 3], [1, 2, 3], [10], [4, 5, 6], [4, 5, 6], [10], [10], [7, 8, 9], [1, 2, 3], [7, 8, 9]]
I picked the chunks in the sample above by hand; I need help getting working Python code that does this for me.
You can determine the number of different chunks in your list (4 in your example), then randomly choose the index of the one you want (between 0 and 3 in your example).
So, you could do:
import math
import random
def random_chunk(lst, chunk_size):
    nb_chunks = int(math.ceil(len(lst)/chunk_size))
    choice = random.randrange(nb_chunks) # 0 <= choice < nb_chunks
    return lst[choice*chunk_size:(choice+1)*chunk_size]
a_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
out = [random_chunk(a_list, chunk_size=3) for _ in range(20)]
print(out)
# [[10], [7, 8, 9], [4, 5, 6], [4, 5, 6], [1, 2, 3], [7, 8, 9], [7, 8, 9], [4, 5, 6],
# [10], [7, 8, 9], [10], [10], [10], [7, 8, 9], [10], [10], [10], [4, 5, 6], [4, 5, 6], [10]]
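An alternative sketch, in case the chunks are needed more than once: split the list a single time and let random.choices draw from the resulting chunks with replacement. The function name sample_chunks and its seed parameter are hypothetical and only there for illustration.

```python
import random


def sample_chunks(items, chunk_size, k, seed=None):
    """Draw k chunks with replacement from items split into chunk_size pieces."""
    rng = random.Random(seed)
    # split once; the last chunk may be shorter than chunk_size
    chunks = [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]
    # random.choices samples with replacement
    return rng.choices(chunks, k=k)


a_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(sample_chunks(a_list, chunk_size=3, k=20))
```

Note that the returned list holds repeated references to the same chunk objects, so copy a chunk before modifying it in place.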

Forming a magic square

We define a magic square to be an n x n matrix of distinct positive integers from 1 to n^2 where the sum of any row, column, or diagonal of length n is always equal to the same number: the magic constant.
You will be given a 3 x 3 matrix s of integers in the inclusive range [1, 9]. We can convert any digit a to any other digit b in the range [1, 9] at a cost of |a - b|. Given s, convert it into a magic square at minimal cost. Print this cost on a new line.
Note: The resulting magic square must contain distinct integers in the inclusive range [1, 9].
For example, we start with the following matrix s:
5 3 4
1 5 8
6 4 2
We can convert it to the following magic square:
8 3 4
1 5 9
6 7 2
This took three replacements at a cost of |5 - 8| + |8 - 9| + |4 - 7| = 7.
I have written a program to solve this, but I get an incorrect result when I run it.
import os
def formingMagicSquare(s):
    arr=[]
    duplicates=[]
    totaldifference=0
    for i in range(0,len(s)):
        linesum=sum(s[i])
        for j in range(0,len(s[i])):
            if(s[i][j] in arr and linesum!=15):
                duplicates.append(i*10+j)
            else:
                arr.append(s[i][j])
    for i in range(0,len(duplicates)):
        iarr = duplicates[i]//10
        jarr = duplicates[i]%10
        linesum=sum(s[i])
        difference=15-linesum
        totaldifference = totaldifference + difference
    return totaldifference
if __name__ == '__main__':
    fptr = open(os.environ['OUTPUT_PATH'], 'w')
    s = []
    for _ in range(3):
        s.append(list(map(int, input().rstrip().split())))
    result = formingMagicSquare(s)
    fptr.write(str(result) + '\n')
    fptr.close()
class Magic(object):
    pre = [
        [[8, 1, 6], [3, 5, 7], [4, 9, 2]],
        [[6, 1, 8], [7, 5, 3], [2, 9, 4]],
        [[4, 9, 2], [3, 5, 7], [8, 1, 6]],
        [[2, 9, 4], [7, 5, 3], [6, 1, 8]],
        [[8, 3, 4], [1, 5, 9], [6, 7, 2]],
        [[4, 3, 8], [9, 5, 1], [2, 7, 6]],
        [[6, 7, 2], [1, 5, 9], [8, 3, 4]],
        [[2, 7, 6], [9, 5, 1], [4, 3, 8]],
    ]
    def evaluate(self, s):
        totals = []
        for p in self.pre:
            total = 0
            for p_row, s_row in zip(p, s):
                for i, j in zip(p_row, s_row):
                    if not i == j:
                        total += max([i, j]) - min([i, j])
            totals.append(total)
        return min(totals)
def main():
    s=[]
    for _ in range(3):
        s.append(list(map(int, input().rstrip().split())))
    magic = Magic()
    result = magic.evaluate(s)
    print(result)
if __name__ == '__main__':
    main()
Thank you. I have written new code and rewritten my solution from the ground up.
I think you can easily try:
import itertools
def forming_magic_square(s):
    # Flatten s
    s = list(itertools.chain.from_iterable(s))
    magic_squares = [
        [8, 1, 6, 3, 5, 7, 4, 9, 2],
        [6, 1, 8, 7, 5, 3, 2, 9, 4],
        [4, 9, 2, 3, 5, 7, 8, 1, 6],
        [2, 9, 4, 7, 5, 3, 6, 1, 8],
        [8, 3, 4, 1, 5, 9, 6, 7, 2],
        [4, 3, 8, 9, 5, 1, 2, 7, 6],
        [6, 7, 2, 1, 5, 9, 8, 3, 4],
        [2, 7, 6, 9, 5, 1, 4, 3, 8],
    ]
    costs = []
    for magic_square in magic_squares:
        costs.append(sum([abs(magic_square[i] - s[i]) for i in range(9)]))
    return min(costs)
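The eight hard-coded squares are the rotations and mirror images of a single 3 x 3 magic square, so they can also be generated instead of typed out. The following is only a sketch of that idea; the helper names rotate, all_magic_squares and min_cost are hypothetical and do not come from the answers above.

```python
def rotate(square):
    """Rotate a square matrix 90 degrees clockwise."""
    return [list(row) for row in zip(*square[::-1])]


def all_magic_squares():
    """Generate the 8 symmetries of one 3 x 3 magic square."""
    current = [[8, 1, 6], [3, 5, 7], [4, 9, 2]]
    squares = []
    for _ in range(4):
        squares.append(current)
        squares.append([row[::-1] for row in current])  # mirror image
        current = rotate(current)
    return squares


def min_cost(s):
    """Smallest sum of |a - b| over all cells and all 8 magic squares."""
    return min(
        sum(abs(a - b) for s_row, m_row in zip(s, m) for a, b in zip(s_row, m_row))
        for m in all_magic_squares()
    )


# the example matrix from the problem statement
print(min_cost([[5, 3, 4], [1, 5, 8], [6, 4, 2]]))  # 7
```

Generating the squares avoids transcription mistakes in the table, at the price of a little extra code.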

How to get the top k elements of a multidimensional array along an axis instead of the single one that argmax gives

I compute predictions as np.argmax(model.predict(X), axis=2), which returns only the single top element along that axis. How can I get the top k elements using NumPy?
The link provided by @desertnaut covers the 1D case. However, generalizing that answer to "N-D along an axis" is not entirely trivial.
Here is an example where we find the top 2 along axis 1:
>>> a = np.random.randint(0, 9, (3, 5, 6))
>>> b = a.argpartition(-2, axis=1)[:, -2:]
>>> i, j, k = a.shape
>>> i, j, k = np.ogrid[:i, :j, :k]
>>> b = b[i, a[i, b, k].argsort(axis=1), k]
>>> a
array([[[8, 4, 1, 2, 4, 8],
[0, 1, 3, 4, 2, 7],
[4, 2, 7, 8, 1, 4],
[1, 6, 2, 0, 3, 7],
[1, 0, 0, 2, 8, 1]],
[[1, 6, 3, 3, 0, 6],
[7, 2, 0, 3, 8, 5],
[5, 0, 1, 1, 7, 4],
[2, 2, 4, 2, 6, 2],
[5, 5, 7, 6, 8, 1]],
[[4, 4, 4, 6, 2, 5],
[2, 7, 8, 2, 6, 0],
[5, 6, 7, 5, 1, 6],
[6, 5, 3, 2, 2, 3],
[5, 1, 8, 1, 6, 8]]])
>>> a[i, b, k]
array([[[4, 4, 3, 4, 4, 7],
[8, 6, 7, 8, 8, 8]],
[[5, 5, 4, 3, 8, 5],
[7, 6, 7, 6, 8, 6]],
[[5, 6, 8, 5, 6, 6],
[6, 7, 8, 6, 6, 8]]])
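A general function could look like this (axis must be non-negative):
def argtopk(A, k, axis=0):
    # indices of the k largest entries along axis, in no particular order
    tk = A.argpartition(-k, axis=axis)[(*axis*(slice(None),), slice(-k, None))]
    # open index grid; list() allows item assignment whether np.ogrid
    # returns a list or a tuple
    I = list(np.ogrid[(*map(slice, A.shape),)])
    I[axis] = tk
    # tuple(I) makes the multi-axis indexing explicit
    I[axis] = A[tuple(I)].argsort(axis=axis)
    return tk[tuple(I)]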
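The same idea can be sketched with np.take_along_axis, which handles the index bookkeeping along one axis. This is an alternative formulation under stated assumptions, not the answer's original method; the function name argtopk_along_axis is hypothetical, and k is assumed to satisfy 1 <= k <= A.shape[axis].

```python
import numpy as np


def argtopk_along_axis(a, k, axis=-1):
    """Indices of the k largest entries along axis, ordered by ascending value."""
    a = np.asarray(a)
    n = a.shape[axis]
    # after partitioning, the last k positions along axis hold the k largest
    part = np.argpartition(a, n - k, axis=axis)
    top = np.take(part, np.arange(n - k, n), axis=axis)
    # order those k indices by the values they point to
    vals = np.take_along_axis(a, top, axis=axis)
    order = np.argsort(vals, axis=axis)
    return np.take_along_axis(top, order, axis=axis)


rng = np.random.default_rng(0)
a = rng.integers(0, 9, (3, 5, 6))
idx = argtopk_along_axis(a, 2, axis=1)
print(np.take_along_axis(a, idx, axis=1))
```

Because only the k selected entries are sorted, this should stay cheaper than a full argsort when k is small compared with the axis length.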
