I'm working on a time-series problem, and I have a list of events in which each data point represents several objects being pulled from an inventory.
Each time the value drops to or below some threshold, I want to add a constant number to the inventory.
For example, I want:
(threshold = 55, constant = 20)
70 60 50 45 30 0 -5 -75
to become:
70 60 70 65 70 60 75 25
Is there a "pythonic" way (pandas, numpy, etc...) to do it with no loops?
Edit: the addition of the constant can occur multiple times, and it only affects the future (i.e. indexes greater than or equal to the index where the low value was observed). This is the code I'm using right now, and my goal is to lose the for loop:
import numpy as np

threshold = 55
constant = 20
a = np.array([70, 60, 50, 45, 30, 0, -5, -75])
b = a.copy()
for i in range(len(b)):
    if b[i] <= threshold:
        # add the constant to this element and to every later one
        temp_add_array = np.zeros(b.shape)
        indexes_to_add = np.array(range(len(b))) >= i
        temp_add_array[indexes_to_add] += constant
        b += temp_add_array.astype(int)
print(b)
print('*************')
print('[70 60 70 65 70 60 75 25]')
Since you're allowing for numpy:
>>> import numpy as np
# threshold and constant
>>> t, c = 55, 20
>>> data = np.asarray([70, 60, 50, 45, 30, 0, -5, -75])
# if you allow for data == threshold
>>> np.where(data >= t, data, data + c*((t-1-data) // c + 1))
array([70, 60, 70, 65, 70, 60, 55, 65])
# if you enforce data > threshold
>>> np.where(data > t, data, data + c*((t-data) // c + 1))
array([70, 60, 70, 65, 70, 60, 75, 65])
But there is really no need for an external dependency for a task like this:
# threshold and constant
>>> t, c = 55, 20
>>> data = [70, 60, 50, 45, 30, 0, -5, -75]
# if you allow for data == threshold
>>> [x if x >= t else x + c*((t-1-x)//c + 1) for x in data]
[70, 60, 70, 65, 70, 60, 55, 65]
# if you enforce data > threshold
>>> [x if x > t else x + c*((t-x)//c + 1) for x in data]
[70, 60, 70, 65, 70, 60, 75, 65]
After the OP's edit:
I doubt there's a (readable) solution for your problem without using a loop. The best thing I could come up with:
>>> import numpy as np
>>> a = np.asarray([70, 60, 50, 45, 30, 0, -5, -75])
# I don't think you *can* get rid of the loop since there are forward dependencies in the data
>>> def stock_inventory(data: np.ndarray, threshold: int, constant: int) -> np.ndarray:
...     res = data.copy()
...     for i, e in enumerate(res):
...         if e <= threshold:
...             res[i:] += constant
...     return res
...
>>> stock_inventory(a, threshold=55, constant=20)
array([70, 60, 70, 65, 70, 60, 75, 25])
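If the loop has to stay, it can at least avoid touching the whole tail of the array on every top-up. The following is a minimal sketch, not part of the answer above: it keeps a running offset instead of updating `res[i:]`, so each element is visited once. The name `stock_inventory_running` is made up for illustration, and it works on a plain list rather than an ndarray.

```python
def stock_inventory_running(data, threshold, constant):
    """Top up the inventory whenever the adjusted value is <= threshold.

    `added` is the total amount added so far; it applies to the current
    element and to every later one, like `res[i:] += constant` does.
    """
    added = 0
    result = []
    for value in data:
        if value + added <= threshold:
            added += constant
        result.append(value + added)
    return result


print(stock_inventory_running([70, 60, 50, 45, 30, 0, -5, -75], 55, 20))
# [70, 60, 70, 65, 70, 60, 75, 25]
```

As in the original loop, at most one top-up happens per position, so a single very low value can still end up below the threshold.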
Assuming a numpy ndarray...
original array is named a
subtract the threshold value from a - name the result b
make a boolean array of b < 0 - name this array c
integer/floor divide b by -1 * constant - name this d (it could be named b as it is no longer needed)
add one to d - name this e
use c as a boolean index to add constant * e to a for those values that were less than the threshold: a[c] += constant * e[c]
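The steps above can be sketched in numpy roughly as follows. This is an illustration under assumptions: the function name `top_up_independently` is invented here, and the number of top-ups `e` is scaled by `constant` when it is added back, since `e` counts top-ups rather than inventory units. Like the `np.where` answer, it treats every element independently and does not carry additions forward.

```python
import numpy as np


def top_up_independently(a, threshold, constant):
    a = np.asarray(a).copy()
    b = a - threshold          # distance from the threshold
    c = b < 0                  # True where the value is below the threshold
    d = b // -constant         # whole multiples of constant below the threshold
    e = d + 1                  # number of top-ups needed to get above it
    a[c] += constant * e[c]    # only touch the values that were too low
    return a


result = top_up_independently([70, 60, 50, 45, 30, 0, -5, -75], 55, 20)
print(result)  # [70 60 70 65 70 60 75 65]
```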
I have two different community structures over the same set of nodes. Each community structure is stored in a dictionary (key: name of the community (string); value: nodes in this community (list of int)) like this:
communities_map_friendship:
C0:[0, 20, 48, 55, 60, 68, 79, 81, 85, ..., 78190]
C1:[1, 6, 10, 13, 18, 19, 22, 24, 26, ..., 78180]
C2:[7, 21, 25, 29, 36, 37, 42, 49, 70, ..., 78146]
C3:[40, 86, 103, 123, 129, 143, 154, 167, ..., 78172]
C4:[66, 83, 133, 169, 174, 175, 205, 237, ..., 78166]
C5:[179, 182, 188, 219, 228, 248, 265, 286, ..., 77981]
communities_map_uservotes:
C0:[0, 20, 41, 48, 55, 60, 68, 79, 81, 85, ..., 78190]
C1:[1, 6, 10, 13, 18, 19, 24, 26, 28, 30, 31, ..., 78173]
C2:[22, 39, 43, 47, 53, 61, 69, 73, 97, 102, ..., 78180]
C3:[7, 21, 25, 29, 36, 37, 42, 49, 70, 80, 83, ..., 78166]
C4:[183, 483, 608, 1453, 2205, 2957, 3090, 3378, ..., 78149]
My goal is to count the cases where two different nodes appear together in one of the community lists in both structures (e.g. (0,20), (0,48), (20,48), ..., (1,6), (1,10), (6,10), ..., (7,21), ...). It doesn't have to be the same community in both. For example, nodes 7 and 21 are in community C2 in the first structure but in community C3 in the second, and this pair should still be counted.
What I have already tried:
# Return True if the two nodes are in the same community, otherwise return False
def Is_In_Same_Community(node1, node2, community_map):
    for community in community_map.values():
        if((node1 in community) and (node2 in community)):
            return True
        elif(((node1 in community) and (node2 not in community)) or ((node1 not in community) and (node2 in community))):
            return False
    return False

# The algorithm, which counts the appropriate value:
TP=0
for community in communities_map_friendship.values():
    res = [Is_In_Same_Community(x,y,communities_map_uservotes)
           for i,x in enumerate(community) for j,y in enumerate(community) if i != j]
    TP = TP + res.count(True)
The algorithm gives the right result, but I have around 30,000 nodes, so it would run for days before I got the value.
Does anyone have an idea to speed up this algorithm somehow?
This shouldn't take days for 30000 nodes, and the while loop could still be optimized some:
def count_pairs( cm1, cm2 ):
    count = 0
    for k,l1 in cm1.items():
        if k not in cm2:
            continue
        l2 = cm2[k]
        # walk both sorted lists in step and collect the shared nodes
        i1 = i2 = 0
        common = []
        while i1 < len(l1) and i2 < len(l2):
            v1 = l1[i1]
            v2 = l2[i2]
            if v1 < v2:
                i1 += 1
            elif v1 > v2:
                i2 += 1
            else:
                common.append(v1)
                i1 += 1
                i2 += 1
        # n shared nodes form n*(n-1)/2 unordered pairs of distinct nodes
        count += len(common)*(len(common)-1)//2
    return count
Taking the latest version of the question into account:
def count_pairs( cm1, cm2 ):
    count = 0
    for k,l1 in cm1.items():
        for k2,l2 in cm2.items():
            # walk both sorted lists in step and collect the shared nodes
            i1 = i2 = 0
            common = []
            while i1 < len(l1) and i2 < len(l2):
                v1 = l1[i1]
                v2 = l2[i2]
                if v1 < v2:
                    i1 += 1
                elif v1 > v2:
                    i2 += 1
                else:
                    common.append(v1)
                    i1 += 1
                    i2 += 1
            # n shared nodes form n*(n-1)/2 unordered pairs of distinct nodes
            count += len(common)*(len(common)-1)//2
    return count
There's a different way to approach this. Consider two lists:
l1 = [0, 20, 48, 55]
l2 = [0, 20, 41, 48, 60]
The shared pairs between these lists are just the permutations (or combinations, if you don't want (0, 20) and (20, 0) to be distinct) of the shared members. For example, the intersection of these lists is:
set(l1) & set(l2)
# {0, 20, 48}
So the shared pairs are (0, 20), (20, 0), (0, 48), (48, 0), (20, 48), (48, 20)
If you only care about the count, then you don't even need to work out those pairs, because for n shared nodes the number of ordered pairs is given by the formula:
n! / (n - 2)! = n * (n - 1)
With that in mind, you can take the product of the values (the node lists) of the two dictionaries and add up the permutation counts of the shared nodes:
from itertools import product
import math

mf = {
    "C0":[0, 20, 48, 55],
    "C1":[1, 6, 10, 13],
    "C2":[7, 21, 25, 55],
}
mu = {
    "C0":[0, 20, 41, 48, 60],
    "C1":[1, 6, 10, 13, 18],
    "C3":[7, 21, 25, 29],
}

TP = 0
for p1, p2 in product(mf.values(), mu.values()):
    num_common = len(set(p1) & set(p2))
    if num_common >= 2:
        TP += math.factorial(num_common)//math.factorial((num_common - 2))
print(TP) # 24
Which is the same answer you get with your code.
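For larger inputs, the product over all community pairs rebuilds a set for every combination. A possible alternative, sketched here under the assumption that each node belongs to at most one community per structure (if a node is listed twice, the last listing wins), is to label every node with its community in both structures and group nodes by that pair of labels. The function name `count_shared_pairs` is made up for this sketch.

```python
from collections import Counter

mf = {
    "C0": [0, 20, 48, 55],
    "C1": [1, 6, 10, 13],
    "C2": [7, 21, 25, 55],
}
mu = {
    "C0": [0, 20, 41, 48, 60],
    "C1": [1, 6, 10, 13, 18],
    "C3": [7, 21, 25, 29],
}


def count_shared_pairs(cm1, cm2):
    # Map every node to the name of its community in each structure.
    label1 = {node: name for name, nodes in cm1.items() for node in nodes}
    label2 = {node: name for name, nodes in cm2.items() for node in nodes}
    # Nodes with the same (label1, label2) combination are together in both.
    groups = Counter((label1[n], label2[n]) for n in label1.keys() & label2.keys())
    # n nodes in a group give n * (n - 1) ordered pairs.
    return sum(n * (n - 1) for n in groups.values())


print(count_shared_pairs(mf, mu))  # 24
```

This visits each node a constant number of times, so the cost grows with the number of nodes rather than with the number of community combinations.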
If I have a list
l = [0,1,2,3,4,5,0,23,34,0,45,0,21,58,98,76,68,0]
I want to replace every 0 with an incrementing value that starts just above the highest value in the list l. In this case the highest value is 98, so the zeros should be replaced with 99, 100, 101, 102 and 103.
This is my solution and it works fine
for ix,i in enumerate(l):
    m = max(l)
    if i == 0:
        l[ix] = (m+1)
But I would like to know what is the best way to solve this problem.
You can use a list comprehension, incrementing via itertools.count():
>>> from itertools import count
>>> lst = [0,1,2,3,4,5,0,23,34,0,45,0,21,58,98,76,68,0]
>>> c = count(max(lst)+1)
>>> [x or next(c) for x in lst]
[99, 1, 2, 3, 4, 5, 100, 23, 34, 101, 45, 102, 21, 58, 98, 76, 68, 103]
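The `x or next(c)` expression works because 0 is falsy, so `next(c)` is only evaluated for the zeros. A variant that spells out the comparison might look like the sketch below; the function name `fill_zeros` is only illustrative.

```python
from itertools import count


def fill_zeros(values):
    """Return a copy of values with each 0 replaced by max+1, max+2, ..."""
    counter = count(max(values) + 1)
    # next(counter) is evaluated only when the element is exactly 0
    return [next(counter) if v == 0 else v for v in values]


lst = [0, 1, 2, 3, 4, 5, 0, 23, 34, 0, 45, 0, 21, 58, 98, 76, 68, 0]
print(fill_zeros(lst))
# [99, 1, 2, 3, 4, 5, 100, 23, 34, 101, 45, 102, 21, 58, 98, 76, 68, 103]
```

Either form calls `max` once, whereas the loop in the question recomputes it on every iteration.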
E.g. For the input 5, the output should be 7.
(bin(1) = 1, bin(2) = 10 ... bin(5) = 101) --> 1 + 1 + 2 + 1 + 2 = 7
Here's what I've tried, but it isn't a very efficient algorithm, considering that I iterate the loop once for each integer. My code (Python 3):
i = int(input())
a = 0
for b in range(i+1):
    a = a + bin(b).count("1")
print(a)
Thank you!
Here's a solution based on the recurrence relation from OEIS:
def onecount(n):
    if n == 0:
        return 0
    if n % 2 == 0:
        m = n//2  # integer division keeps the result an int on Python 3
        return onecount(m) + onecount(m-1) + m
    m = (n-1)//2
    return 2*onecount(m)+m+1
>>> [onecount(i) for i in range(30)]
[0, 1, 2, 4, 5, 7, 9, 12, 13, 15, 17, 20, 22, 25, 28, 32, 33, 35, 37, 40, 42, 45, 48, 52, 54, 57, 60, 64, 67, 71]
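As a cross-check on these totals, the same quantity can also be computed without recursion by counting, for each bit position, how many of the numbers 0..n have that bit set. This is a different technique from the OEIS recurrence, offered only as a sketch; the name `total_set_bits` is made up here.

```python
def total_set_bits(n):
    """Count the 1 bits in the binary forms of all integers from 0 to n."""
    total = 0
    bit = 1  # value of the bit position being counted: 1, 2, 4, ...
    while bit <= n:
        cycle = bit * 2
        # Within each block of `cycle` consecutive numbers, this bit is
        # off for the first `bit` numbers and on for the next `bit`.
        full_cycles, remainder = divmod(n + 1, cycle)
        total += full_cycles * bit + max(0, remainder - bit)
        bit = cycle
    return total


print(total_set_bits(5))  # 7
```

The loop runs once per bit of n, so the work grows with the number of digits in n rather than with n itself.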
gmpy2, due to Alex Martelli et al., seems to perform better, at least on my Win10 machine.
from time import time
import gmpy2

def onecount(n):
    if n == 0:
        return 0
    if n % 2 == 0:
        m = n//2  # integer division keeps the result an int on Python 3
        return onecount(m) + onecount(m-1) + m
    m = (n-1)//2
    return 2*onecount(m)+m+1

N = 10000

# time the recursive version
initial = time()
for _ in range(N):
    for i in range(30):
        onecount(i)
print (time()-initial)

# time a running total of gmpy2.popcount
initial = time()
for _ in range(N):
    total = 0
    for i in range(30):
        total+=gmpy2.popcount(i)
print (time()-initial)
Here's the output:
1.7816979885101318
0.07404899597167969
If you want a list, and you're using Python 3.2 or later (where itertools.accumulate is available):
>>> from itertools import accumulate
>>> result = list(accumulate([gmpy2.popcount(_) for _ in range(30)]))
>>> result
[0, 1, 2, 4, 5, 7, 9, 12, 13, 15, 17, 20, 22, 25, 28, 32, 33, 35, 37, 40, 42, 45, 48, 52, 54, 57, 60, 64, 67, 71]
I want to write a program in Python that adds 5 per count until count is 20, so the total would be 100. Basically, I want to show the result of 5 * 20 this way.
num = 5
count = 0
total = 0
I tried this code but it returns as zero. Why?
while(count == 20):
    total = num * count
    if(total == num * count):
        count = count + 1
print total
Please fix any mistake I made. I'm new to Python...
You probably meant while count <= 20:
The condition specified for a while loop is what needs to be true for it to keep running - not for when it ends.
Also note that you don't need parentheses around the while and if conditions.
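To make the point about the loop condition concrete, here is a minimal sketch of a loop that adds 5 per count, which is how the question describes the goal. It is written for Python 3, so `print` is a function, unlike in the question's code. Because the total is updated before the count, the condition is `count < 20` rather than `count <= 20`.

```python
num = 5
count = 0
total = 0

# The condition says when to KEEP looping: as long as count is below 20.
while count < 20:
    total += num   # add 5 for this count
    count += 1

print(total)  # 100
```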
Your code also has some odd redundancies, though.
For instance:
total = num * count
if total == num * count:
    count = count + 1
The if statement will always be true, given that you're setting total to the same thing you check it against, in the previous line. In other words, you could have just written...
total = num * count
if True:
    count = count + 1
or even just...
total = num * count
count = count + 1
Furthermore...
You set total equal to num * count on each iteration, but if your goal is just to print out num * 20, you don't have to count up to 20 - you can just start out at 20.
num = 5
count = 20
print num * count
Also note...
that this line:
count = count + 1
can be written more concisely as...
count += 1
Finally...
If what you really wanted was a list of numbers in increments of 5 up to 100, you could do this:
>>> range(0, 101, 5)
[0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
or this:
>>> [n*5 for n in range(21)]
[0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100]