Is there a way to simulate the following output using scipy.signal instead of loops?
import pandas as pd
df_in = pd.DataFrame({'Generated':[13,8,7,6],'Consume':[8,10,20,5]})
print(df_in)
   Generated  Consume
0         13        8
1          8       10
2          7       20
3          6        5
df_in['balance'] = [5,3,0,1]
Here 13 - 8 yields a balance of 5; the 5 is carried to the next line, and 5 + 8 - 10 yields a balance of 3.
The 3 is carried to the next line; 3 + 7 - 20 yields a negative number, but you can't carry a negative balance.
So the next line is 0 carry + 6 - 5, which yields a balance of 1.
print(df_in)
Expected output:
   Generated  Consume  balance
0         13        8        5
1          8       10        3
2          7       20        0
3          6        5        1
If it weren't for the requirement to only add the carry when the balance is positive, you could use an accumulator on the difference. This accumulator can be implemented using lfilter, obtaining the b and a parameters from the recurrence equation y[n] = y[n-1] + x[n]:
import scipy.signal

x = df_in['Generated'] - df_in['Consume']
df_in['balance'] = scipy.signal.lfilter([1], [1, -1], x)
Unfortunately, adding the carry only when the balance stays positive makes the process non-linear, which scipy.signal.lfilter is not made to handle. At this point you have to resort to a loop to handle the special case.
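For example, a minimal loop sketch (assuming the df_in above) that carries the balance forward and clamps it at zero:

balance = []
carry = 0
for gen, con in zip(df_in['Generated'], df_in['Consume']):
    carry = max(carry + gen - con, 0)  # negative balances are not carried
    balance.append(carry)
df_in['balance'] = balance  # [5, 3, 0, 1]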
I am trying to find the point where a kernel density estimate becomes almost 0, at the end of its tail. My approach is to evaluate the kernel function over a timeline from -120 to 120 and compute the percentage change of the kernel values; then I can declare an arbitrary rule that after 10 consecutive negative changes, where the kernel value is almost 0, that point marks the start of the curve's tail.
Illustration of the point on the kernel function I want to obtain; in this case the final value I would like to obtain is around 300.
My dataframe looks like this (these are not the same example values as above):
df
id  event_time
 1           2
 1           3
 1           3
 1           5
 1           9
 1          10
 2           1
 2           1
 2           2
 2           2
 2           5
 2           5
# my try
import numpy as np
from scipy import stats

def find_value(df):
    if df.shape[0] == 1:
        return df.iloc[0].event_time
    kernel = stats.gaussian_kde(df['event_time'])
    time = list(range(-120, 120))
    a = kernel(time)
    b = np.diff(a) / a[:-1] * 100
So far I have a, which represents the Y axis of the graph, and b, which represents the change in Y. I did this to implement the logic described at the beginning, but I don't know how to code that last step. After writing the function I was thinking of using a groupby and an apply.
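One way to finish it is sketched below; the near-zero threshold eps and the None fallback are my assumptions, not from the original post:

import numpy as np
from scipy import stats

def find_value(df, run_length=10, eps=1e-4):
    if df.shape[0] == 1:
        return df.iloc[0].event_time
    kernel = stats.gaussian_kde(df['event_time'])
    time = np.arange(-120, 120)
    a = kernel(time)               # kernel values (the Y axis)
    b = np.diff(a) / a[:-1] * 100  # percentage change of the kernel values
    run = 0
    for idx in range(len(b)):
        # count consecutive negative changes where the kernel is almost 0
        run = run + 1 if (b[idx] < 0 and a[idx + 1] < eps) else 0
        if run == run_length:
            return time[idx + 1]   # declare this the start of the tail
    return None

result = df.groupby('id').apply(find_value)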
I am researching how Python implements dictionaries. One of the equations in the Python dictionary implementation relates to the pseudo-random probing for an empty dictionary slot, using the equation
j = ((j*5) + 1) % 2**i
which is explained here.
I have read this question, How are Python's Built In Dictionaries Implemented?, and basically understand how dictionaries are implemented.
What I don't understand is why/how the equation:
j = ((j*5) + 1) % 2**i
cycles through all the remainders of 2**i. For instance, if i = 3, for a total starting size of 8, j goes through the cycle:
0 1 6 7 4 5 2 3 0
if the starting size is 16, it would go through the cycle:
0 1 6 15 12 13 2 11 8 9 14 7 4 5 10 3 0
This is very useful for probing all the slots in the dictionary. But why does it work? Why does j = ((j*5)+1) work, but not j = ((j*6)+1) or j = ((j*3)+1), both of which get stuck in smaller cycles?
I am hoping to get a more intuitive understanding of this than "the equation just works, and that's why they used it".
This is the same principle that pseudo-random number generators use, as Jasper hinted at, namely linear congruential generators. A linear congruential generator is a sequence that follows the relationship X_(n+1) = (a * X_n + c) mod m. From the wiki page,
The period of a general LCG is at most m, and for some choices of factor a much less than that. The LCG will have a full period for all seed values if and only if:
m and c are relatively prime.
a - 1 is divisible by all prime factors of m.
a - 1 is divisible by 4 if m is divisible by 4.
It's easy to see that 5 is the smallest a that satisfies these requirements, namely:
m = 2^i and c = 1 are relatively prime.
a - 1 = 4 is divisible by 2, the only prime factor of m.
a - 1 = 4 is divisible by 4, and m = 2^i is divisible by 4.
Also interestingly, 5 is not the only number that satisfies these conditions. 9 will also work. Taking m to be 16, using j=(9*j+1)%16 yields
0 1 10 11 4 5 14 15 8 9 2 3 12 13 6 7
The proof for these three conditions can be found in the original Hull-Dobell paper on page 5, along with a bunch of other PRNG-related theorems that also may be of interest.
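To see the condition in action, here is a small sketch (my own illustration, not CPython's code) that counts how many distinct slots j visits, starting from 0, before it repeats:

def states_visited(a, m, c=1):
    seen, j = set(), 0
    while j not in seen:
        seen.add(j)
        j = (a * j + c) % m
    return len(seen)

m = 16
for a in (3, 5, 6, 9):
    print(a, states_visited(a, m))
# 3 and 6 visit fewer than 16 slots; 5 and 9 visit all 16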
I have to do this for school and I don't know how to.
Write a function print_triangular_numbers(n) that prints out the first n triangular numbers (n is an input). A call to print_triangular_numbers(5) would produce the following output:
n result
1 1
2 3
3 6
4 10
5 15
A triangular number can be expressed as
n(n+1)/2
Thus, you need to build a simple loop, starting at 1 and going through your passed parameter:
def print_triangular_numbers(n):
    for x in range(1, n + 1):
        print(x, x * (x + 1) // 2)  # integer division keeps the result an int
The for loop starts at 1 and runs through n; the end point is written as n + 1 because range is not inclusive of its end point.
This outputs:
1 1
2 3
3 6
4 10
5 15
I have a pandas series of value_counts for a data set. I would like to plot the data with a color band (I'm using bokeh, but calculating the data band is the important part):
I hesitate to use the word standard deviation since all the references I use calculate that based on the mean value, and I specifically want to use the mode as the center.
So, basically, I'm looking for a way in pandas to start at the mode and return a new series of value counts that includes 68.2% of the sum of the value counts. If I had this series:
val  count
  1      0
  2      0
  3      3
  4      1
  5      2
  6      5  <-- mode
  7      4
  8      3
  9      2
 10      1
total = sum(count) # example value 21
band1_count = 21 * 0.682 # example value ~ 14.3
This is the order in which they would be added, based on an algorithm that walks the value counts on each side of the mode and includes the higher of the two until the sum of the counts is greater than 14.3.
band1_values = [6, 7, 8, 5, 9]
Here are the steps:
val  count  step
  1      0
  2      0
  3      3
  4      1
  5      2  <-- 4) add to list -- eq (9,2), closer to (6,5)
  6      5  <-- 1) add to list -- mode
  7      4  <-- 2) add to list -- gt (5,2)
  8      3  <-- 3) add to list -- gt (5,2)
  9      2  <-- 5) add to list -- gt (4,1), stop since sum of counts > 14.3
 10      1
Is there a native way to do this calculation in pandas or numpy? If there is a formal name for this study, I would appreciate knowing what it's called.
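I don't know of a single built-in for exactly this, but the walk itself is short. A minimal sketch, assuming the series above and breaking ties toward the value closer to the mode (as step 4 does):

import pandas as pd

counts = pd.Series([0, 0, 3, 1, 2, 5, 4, 3, 2, 1], index=range(1, 11))
target = counts.sum() * 0.682        # ~14.3

mode = counts.idxmax()
band1_values = [mode]
total = counts[mode]
left, right = mode - 1, mode + 1
while total <= target:
    lc = counts.get(left, -1)        # -1 once we walk off the series
    rc = counts.get(right, -1)
    # take the side with the higher count; break ties toward the mode
    if rc > lc or (rc == lc and right - mode < mode - left):
        band1_values.append(right)
        total += rc
        right += 1
    else:
        band1_values.append(left)
        total += lc
        left -= 1

print(band1_values)                  # [6, 7, 8, 5, 9]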
I have an equation for a type of step function, which I obtained with Wolfram Alpha:
a_n = 1/8 (2n + (-1)^n - (1+i)(-i)^n - (1-i) i^n + 9)
Using it in Wolfram with any positive integer yields a positive integer result; however, when I try the following in Python,
import numpy as np
n = 5
i = complex(0,1)
a = (1/8)*((2*n)+(np.power(-1,n))-(1+i)*(np.power(-i,n))-(1-i)*(np.power(i,n))+9)
I'm always stuck with some real part plus an imaginary part. I need to be able to obtain an integer output for a, for use in other equations.
Maybe you want int(a.real) at the end.
Also be aware that 1/8 evaluates to 0 in Python 2.x, because / is integer division there by default.
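For instance, a minimal illustration of that suggestion:

a_int = int(round(a.real))  # drop the numerically-zero imaginary part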
(1+i)(-i)^n + (1-i) i^n
is two times the real part of (1-i) i^n, which is, for instance,
2*cos(pi/2*n)-2*cos(pi/2*(n+1))
or as values
n            0   1   2   3   4   5   6   7   8
expression   2   2  -2  -2   2   2  -2  -2   2
This is subtracted from the alternating sequence (-1)^n to give
n            0   1   2   3   4   5   6   7   8
(-1)^n-expr -1  -3   3   1  -1  -3   3   1  -1
which repeats periodically with period 4.
This can be computed, avoiding all powers and safeguarding against negative n, as
3-2*(((n+2) mod 4 +4) mod 4)
Adding 2n+9 to complete the expression gives
12+2*n-2*(((n+2) mod 4 +4) mod 4)
which is indeed divisible by 8, so
a = 1+(n+2-(((n+2) % 4 +4) % 4) )/4
Now if one considers that this just reduces n+2 to the next lower multiple of 4, this is equivalent to the simplified
a = 1 + (n+2)/4
using integer division.
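A quick sanity check of the simplified form against the original complex expression (a sketch, for n >= 0):

i = complex(0, 1)

def a_original(n):
    return ((2*n) + (-1)**n - (1+i)*((-i)**n) - (1-i)*(i**n) + 9) / 8

def a_simplified(n):
    return 1 + (n + 2) // 4  # integer division

for n in range(16):
    assert abs(a_original(n).real - a_simplified(n)) < 1e-9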