Find entry exit signal from time series data - python

I have a small time series:
ser = pd.Series([2,3,4,5,6,0,8,7,1,3,4,0,6,4,0,2,4,0,4,5,0,1,7,0,1,8,5,3,6])
Let's say we choose a threshold of 5 to enter the market and 0 to exit it.
I am trying to write a program that will generate an output like this:
So far I have used numba, but I am still working on the logic. Can you please help?
#numba.vectorize
def check_signal(x,t):
    if x >= t :
        y = 2
    if x < t :
        y =1
    if x == 0:
        y = -1
    else :
        y = y
    return y

Why would you use numba unless you had tens of millions of these samples?
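states = ["Entered market", "inside market", "market exit", "outside market"]
state = 2
with open('seriesdata.csv', 'w') as fout:
    print("Time,Percent_change,Signal,Timestamp", file=fout)
    # time is the 1-based row number of the current sample
    for time, pct in enumerate(ser, start=1):
        stamp = ''
        if state == 1 and pct == 0:
            # inside the market and the exit value appears: leave
            state = 2
            stamp = str(time)
        elif state == 3 and pct >= 5:
            # outside the market and the entry threshold is reached: enter
            state = 0
            stamp = str(time)
        elif state in (0, 2):
            # "Entered market" becomes "inside market", "market exit" becomes "outside market"
            state += 1
        print(','.join((str(time), str(pct), states[state], stamp)), file=fout)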
If you'd rather make a dataframe, just accumulate those values in a list and convert after.
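A minimal sketch of that list-then-DataFrame variant, assuming the same four states and thresholds as above. The function name label_signals, the column names and the use of the 1-based row number as the timestamp are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

STATES = ["Entered market", "inside market", "market exit", "outside market"]

def label_signals(ser, enter=5, exit_value=0):
    """Walk the series once and label every sample with a market state."""
    state = 2  # start as if we had just exited, so the first row is "outside market"
    rows = []
    for time, pct in enumerate(ser, start=1):
        stamp = None
        if state == 1 and pct == exit_value:
            state = 2
            stamp = time
        elif state == 3 and pct >= enter:
            state = 0
            stamp = time
        elif state in (0, 2):
            state += 1
        rows.append((time, pct, STATES[state], stamp))
    return pd.DataFrame(rows, columns=["Time", "Percent_change", "Signal", "Timestamp"])

ser = pd.Series([2,3,4,5,6,0,8,7,1,3,4,0,6,4,0,2,4,0,4,5,0,1,7,0,1,8,5,3,6])
signals = label_signals(ser)
```

With this series the first entry lands on the fourth sample (value 5) and the first exit on the sixth sample (value 0).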


How to port the Pine Script "cross" function to Python?

https://www.tradingview.com/pine-script-reference/v5/#fun_ta%7Bdot%7Dcross
ta.cross(source1, source2) → series bool
RETURNS
true if two series have crossed each other, otherwise false.
ARGUMENTS
source1 (series int/float) First data series.
source2 (series int/float) Second data series.
Pine Script:
cross_1 = cross(longband[1], RSIndex)
trend := cross(RSIndex, shortband[1]) ? 1 : cross_1 ? -1 : nz(trend[1], 1)
FastAtrRsiTL = trend == 1 ? longband : shortband
My attempt in Python:
cross_1 = cross(longband[1], RSIndex)
if np.isclose[-1]( shortband[1], RSIndex):
    trend = 1
elif cross_1 == True:
    trend = -1
else:
    trend = trend[1], 1
if trend == 1:
    FastAtrRsiTL = longband
else:
    FastAtrRsiTL = shortband
I need a cross function, but I don't know how to implement it.
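One possible way to sketch such a function with pandas, assuming both inputs are pandas Series on the same index. It treats a cross as "source1 moved from at-or-below source2 to above it, or from at-or-above to below it, between the previous bar and the current one", which is how I read the Pine description; it is an approximation for illustration, not TradingView's implementation. The sample values are made up.

```python
import pandas as pd

def cross(source1: pd.Series, source2: pd.Series) -> pd.Series:
    """Boolean series: True on bars where source1 and source2 crossed each other."""
    diff = source1 - source2
    prev = diff.shift(1)                      # difference on the previous bar
    crossed_up = (diff > 0) & (prev <= 0)     # source1 moved above source2
    crossed_down = (diff < 0) & (prev >= 0)   # source1 moved below source2
    return crossed_up | crossed_down

# Made-up data: a rises through b, then falls back through it.
a = pd.Series([1, 2, 3, 2, 1])
b = pd.Series([2, 2, 2, 2, 2])
crossed = cross(a, b)
# crossed.tolist() -> [False, False, True, False, True]
```

In this style a Pine expression such as longband[1] (the previous bar's value) would correspond to longband.shift(1), so the whole series is compared at once instead of bar by bar.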

How to properly add gradually increasing/decreasing space between objects?

I've been trying to implement a transition from one amount of spacing to another, similar to acceleration and deceleration. It failed, and all I got was an endless stack of circles. Here is a screenshot showing it in action:
You can see a very black circle here, which is in reality something like 100 or 200 circles stacked on top of each other.
I reached this result using this piece of code:
def Place_circles(curve, circle_space, cs, draw=True, screen=None):
    curve_acceleration = []
    if type(curve) == tuple:
        curve_acceleration = curve[1][0]
        curve_intensity = curve[1][1]
        curve = curve[0]
        #print(curve_intensity)
        #print(curve_acceleration)
    Circle_list = []
    idx = [0,0]
    for c in reversed(range(0,len(curve))):
        for p in reversed(range(0,len(curve[c]))):
            user_dist = circle_space[curve_intensity[c]] + curve_acceleration[c] * p
            dist = math.sqrt(math.pow(curve[c][p][0] - curve[idx[0]][idx[1]][0],2)+math.pow(curve[c][p][1] - curve[idx[0]][idx[1]][1],2))
            if dist > user_dist:
                idx = [c,p]
                Circle_list.append(circles.circles(round(curve[c][p][0]), round(curve[c][p][1]), cs, draw, screen))
This places circles depending on the intensity of the current curve (a random number between 0 and 2), which maps to an amount of space (say between 20 and 30 here: 20 at index 0, 30 at index 2, and a value between those two at index 1).
This creates the stack you see above, which isn't what I want. I also came to the conclusion that I cannot use acceleration: the time needed to move between two points depends on the number of circles I need to click, and since there are multiple circles between each pair of points and I cannot determine how many, I am unable to use the classic acceleration formula.
So I'm running out of options and ideas for how to transition from one amount of spacing to another.
Any ideas?
PS: I scrapped the idea above and switched back to my master branch, but the code for it is still available in the branch I created here: https://github.com/Mrcubix/Osu-StreamGenerator/tree/acceleration .
So now I'm back to my normal code, which has no acceleration or deceleration.
TL;DR: I can't use acceleration because I don't know how many circles will be placed between the two points, which makes the travel time vary (for example, I need to click circles at 180 bpm, one circle every 0.333 s), so I'm looking for another way to generate gradually changing spacing.
First, I kept my function that generates an intensity in [0 ; 2] for each curve.
Then I scrapped the acceleration formula, as it's unusable.
Now I'm using a basic algorithm to determine the maximum number of circles I can place on a curve.
The way my script works is the following:
I first generate a stream (multiple circles that need to be clicked at high bpm).
This way I obtain the length of each curve (or segment) of the polyline.
I generate an intensity for each curve using the following function:
def generate_intensity(Circle_list: list = None, circle_space: int = None, Args: list = None):
    curve_intensity = []
    if not Args or Args[0] == "NewProfile":
        prompt = True
        while prompt:
            max_duration_intensity = input("Choose the maximum amount of curve the change in intensity will occur for: ")
            if max_duration_intensity.isdigit():
                max_duration_intensity = int(max_duration_intensity)
                prompt = False
        prompt = True
        while prompt:
            intensity_change_odds = input("Choose the odds of occurence for changes in intensity (1-100): ")
            if intensity_change_odds.isdigit():
                intensity_change_odds = int(intensity_change_odds)
                if 0 < intensity_change_odds <= 100:
                    prompt = False
        prompt = True
        while prompt:
            min_intensity = input("Choose the lowest amount of spacing a circle will have: ")
            if min_intensity.isdigit():
                min_intensity = float(min_intensity)
                if min_intensity < circle_space:
                    prompt = False
        prompt = True
        while prompt:
            max_intensity = input("Choose the highest amount of spacing a circle will have: ")
            if max_intensity.isdigit():
                max_intensity = float(max_intensity)
                if max_intensity > circle_space:
                    prompt = False
        prompt = True
    if Args:
        if Args[0] == "NewProfile":
            return [max_duration_intensity, intensity_change_odds, min_intensity, max_intensity]
        elif Args[0] == "GenMap":
            max_duration_intensity = Args[1]
            intensity_change_odds = Args[2]
            min_intensity = Args[3]
            max_intensity = Args[4]
    circle_space = ([min_intensity, circle_space, max_intensity] if not Args else [Args[0][3],circle_space,Args[0][4]])
    count = 0
    for idx, i in enumerate(Circle_list):
        if idx == len(Circle_list) - 1:
            if random.randint(0,100) < intensity_change_odds:
                if random.randint(0,100) > 50:
                    curve_intensity.append(2)
                else:
                    curve_intensity.append(0)
            else:
                curve_intensity.append(1)
        if random.randint(0,100) < intensity_change_odds:
            if random.randint(0,100) > 50:
                curve_intensity.append(2)
                count += 1
            else:
                curve_intensity.append(0)
                count += 1
        else:
            if curve_intensity:
                if curve_intensity[-1] == 2 and not count+1 > max_duration_intensity:
                    curve_intensity.append(2)
                    count += 1
                    continue
                elif curve_intensity[-1] == 0 and not count+1 > max_duration_intensity:
                    curve_intensity.append(0)
                    count += 1
                    continue
                elif count+1 > 2:
                    curve_intensity.append(1)
                    count = 0
                    continue
                else:
                    curve_intensity.append(1)
            else:
                curve_intensity.append(1)
    curve_intensity.reverse()
    if curve_intensity.count(curve_intensity[0]) == len(curve_intensity):
        print("Intensity didn't change")
        return circle_space[1]
    print("\n")
    return [circle_space, curve_intensity]
With this, I obtain two lists: one with the spacings I specified, and one with the randomly generated intensities.
From there I call another function that takes as arguments the polyline, the previously specified spacings and the generated intensities:
def acceleration_algorithm(polyline, circle_space, curve_intensity):
    new_circle_spacing = []
    for idx in range(len(polyline)): #repeat 4 times
        spacing = []
        Length = 0
        best_spacing = 0
        for p_idx in range(len(polyline[idx])-1): #repeat 1000 times / p_idx in [0 ; 1000]
            # Create multiple lists containing spacing going from circle_space[curve_intensity[idx]] to circle_space[curve_intensity[idx+1]]
            spacing.append(np.linspace(circle_space[curve_intensity[idx]],circle_space[curve_intensity[idx+1]], p_idx).tolist())
            # Sum distance to find length of curve
            Length += abs(math.sqrt((polyline[idx][p_idx+1][0] - polyline[idx][p_idx][0]) ** 2 + (polyline[idx][p_idx+1][1] - polyline[idx][p_idx][1]) ** 2))
        for s in range(len(spacing)): # probably has 1000 lists in 1 list
            length_left = Length # Make sure to reset length for each iteration
            for dist in spacing[s]: # subtract each spacing value in spacing[s]
                length_left -= dist
            if length_left > 0:
                best_spacing = s
            else: # Since length < 0, use previous working index (best_spacing), could also just do `s-1`
                if spacing[best_spacing] == []:
                    new_circle_spacing.append([circle_space[1]])
                    continue
                new_circle_spacing.append(spacing[best_spacing])
                break
    return new_circle_spacing
With this, I obtain a list with the space between each of the circles that are going to be placed.
From there, I can call Place_circles() again and obtain the new stream:
def Place_circles(polyline, circle_space, cs, DoDrawCircle=True, surface=None):
    Circle_list = []
    curve = []
    next_circle_space = None
    dist = 0
    for c in reversed(range(0, len(polyline))):
        curve = []
        if type(circle_space) == list:
            iter_circle_space = iter(circle_space[c])
            next_circle_space = next(iter_circle_space, circle_space[c][-1])
        for p in reversed(range(len(polyline[c])-1)):
            dist += math.sqrt((polyline[c][p+1][0] - polyline[c][p][0]) ** 2 + (polyline[c][p+1][1] - polyline[c][p][1]) ** 2)
            if dist > (circle_space if type(circle_space) == int else next_circle_space):
                dist = 0
                curve.append(circles.circles(round(polyline[c][p][0]), round(polyline[c][p][1]), cs, DoDrawCircle, surface))
                if type(circle_space) == list:
                    next_circle_space = next(iter_circle_space, circle_space[c][-1])
        Circle_list.append(curve)
    return Circle_list
The result is a stream with varying space between circles (so accelerating or decelerating). The only issue left to fix is pygame not updating the screen with the new set of circles after I call Place_circles(), but that's an issue I'm either going to try to fix myself or ask about in another post.
The final code for this feature can be found in my repo: https://github.com/Mrcubix/Osu-StreamGenerator/tree/Acceleration_v02
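As a side note, the search over candidate spacings in acceleration_algorithm could probably be shortened. The sum of n evenly spaced values going from a to b is n * (a + b) / 2, so the largest n that still fits on a curve of a given length can be computed directly. The sketch below assumes that reading of the algorithm; the function name gradual_spacing and its parameters are hypothetical and do not come from the repository.

```python
import math

def gradual_spacing(length, start_space, end_space):
    """Spacings going linearly from start_space to end_space whose sum fits in length."""
    mean = (start_space + end_space) / 2
    n = math.floor(length / mean)   # largest count with n * mean <= length
    if n <= 0:
        return []                   # curve too short for even one circle
    if n == 1:
        return [mean]               # a single circle gets the average spacing
    step = (end_space - start_space) / (n - 1)
    return [start_space + step * i for i in range(n)]

# A curve of length 100 going from a spacing of 20 to a spacing of 30
spacings = gradual_spacing(100, 20, 30)
```

Here four spacings fit (20, about 23.3, about 26.7 and 30), and together they use up the whole length of 100.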

Extracting one wavelength of a sample wave made of discrete data

In the following piece of code I've extracted a window of data from an audio sample (a 1000 Hz signal) and tried to obtain one wavelength of the signal.
https://paste.pound-python.org/show/HRVqQNy3w9Sr73q4oY8g/
sample = data[100:200]
x = 0
i = 1
num_occur = 0
while num_occur <2:
    if sample[i] == sample[0]:
        x = i
        i += 1
        num_occur += 1
    else:
        i += 1
wavelen = sample[:x]
But without much success...
The image of the sample: (https://pasteboard.co/HFFXGxW.png)
I do understand what the problem is: even though matplotlib plots the wave as continuous (due to the high sampling frequency), the wave is made up of discrete data, so there may or may not be a data value satisfying:
sample[i] == sample[0]
I'd greatly appreciate any help and advice on how to get around this problem.
Someone enlightened me on how to get to the answer. It's a simple but logical approach.
I just needed to find the points where the signal crosses zero; the wave between the 1st and 3rd consecutive such points gives me one wavelength.
Here's the code I wrote:
sample = data[100:200]
i = 1
num_occur = 0
cut_zero = []
x = 0
while num_occur < 3:
    if sample[x] == abs(sample[x]):
        if sample[i] == abs(sample[i]):
            i +=1
        else:
            cut_zero.append(i)
            num_occur += 1
            i += 1
            x = i
    elif sample[x] != abs(sample[x]):
        if sample[i] == abs(sample[i]):
            cut_zero.append(i)
            i += 1
            num_occur += 1
            x = i
        else:
            i += 1
print(cut_zero)
a = cut_zero[0]
b = cut_zero[2]
wavelen = sample[a:b]
Maybe it could be done more efficiently :) If so, let me know.
Here's the image of the wavelength: https://pasteboard.co/HFLtOmr.png
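One possibly more compact variant of the same zero-crossing idea, sketched with NumPy. It assumes the window is a 1-D array of numbers; the function name one_wavelength and the synthetic sine wave are made up for illustration, since the original audio data is not available here.

```python
import numpy as np

def one_wavelength(sample):
    """Return the samples between the 1st and 3rd sign change of the signal."""
    sample = np.asarray(sample, dtype=float)
    negative = np.signbit(sample).astype(np.int8)      # 1 where the value is negative
    crossings = np.flatnonzero(np.diff(negative)) + 1  # index of first sample after each sign change
    if len(crossings) < 3:
        raise ValueError("need at least three zero crossings in the window")
    return sample[crossings[0]:crossings[2]]

# Synthetic stand-in for the audio window: a sine with a period of 20 samples,
# offset by half a sample so that no value is exactly zero.
t = np.arange(100)
wave = np.sin(2 * np.pi * (t + 0.5) / 20)
wavelen = one_wavelength(wave)
```

On this synthetic wave the slice is 20 samples long, which is one full period.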

Comparing values in Python data frame efficiently

I trade cryptocurrencies daily and would like to find which ones are the most desirable for trading.
I have a CSV file for every cryptocurrency with the following fields:
Date Sell Buy
43051.23918 1925.16 1929.83
43051.23919 1925.12 1929.79
43051.23922 1925.12 1929.79
43051.23924 1926.16 1930.83
43051.23925 1926.12 1930.79
43051.23926 1926.12 1930.79
43051.23927 1950.96 1987.56
43051.23928 1190.90 1911.56
43051.23929 1926.12 1930.79
I would like to check:
How many quotes will end with a profit:
for Buy positions - if any of the following Sells > the current Buy.
for Sell positions - if any of the following Buys < the current Sell.
How much time it would take for a theoretical position to become profitable.
What the profit potential is.
I'm using the following code:
#converting from OLE to datetime
OLE_TIME_ZERO = dt.datetime(1899, 12, 30, 0, 0, 0)
def ole(oledt):
    return OLE_TIME_ZERO + dt.timedelta(days=float(oledt))
#variables initialization
buy_time = ole(43031.57567) - ole(43031.57567)
sell_time = ole(43031.57567) - ole(43031.57567)
profit_buy_counter = 0
no_profit_buy_counter = 0
profit_sell_counter = 0
no_profit_sell_counter = 0
max_profit_buy_positions = 0
max_profit_buy_counter = 0
max_profit_sell_positions = 0
max_profit_sell_counter = 0
df = pd.read_csv("C:/P/Crypto/bitcoin_test_normal_276k.csv")
#comparing to max
for index, row in df.iterrows():
    a = index + 1
    df_slice = df[a:]
    if df_slice["Sell"].max() - row["Buy"] > 0:
        max_profit_buy_positions += df_slice["Sell"].max() - row["Buy"]
        max_profit_buy_counter += 1
    for index1, row1 in df_slice.iterrows():
        if row["Buy"] < row1["Sell"] :
            buy_time += ole(row1["Date"])- ole(row["Date"])
            profit_buy_counter += 1
            break
    else:
        no_profit_buy_counter += 1
#comparing to sell
for index, row in df.iterrows():
    a = index + 1
    df_slice = df[a:]
    if row["Sell"] - df_slice["Buy"].min() > 0:
        max_profit_sell_positions += row["Sell"] - df_slice["Buy"].min()
        max_profit_sell_counter += 1
    for index2, row2 in df_slice.iterrows():
        if row["Sell"] > row2["Buy"] :
            sell_time += ole(row2["Date"])- ole(row["Date"])
            profit_sell_counter += 1
            break
    else:
        no_profit_sell_counter += 1
num_rows = len(df.index)
buy_avg_time = buy_time/num_rows
sell_avg_time = sell_time/num_rows
if max_profit_buy_counter == 0:
    avg_max_profit_buy = "There is no profitable buy positions"
else:
    avg_max_profit_buy = max_profit_buy_positions/max_profit_buy_counter
if max_profit_sell_counter == 0:
    avg_max_profit_sell = "There is no profitable sell positions"
else:
    avg_max_profit_sell = max_profit_sell_positions/max_profit_sell_counter
The code works fine for 10K-20K lines, but for a larger amount (276K) it takes a long time (more than 10 hours).
What can I do to improve it?
Is there any "Pythonic" way to compare each value in a data frame to all following values?
Note: the dates in the CSV are in OLE format, so I need to convert them to datetime.
File for testing:
Thanks for your comment.
Here you can find the file that I used:
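First, I'd want to create the cumulative maximum/minimum values for Sell and Buy per row, so it's easy to compare against them. pandas has cummax and cummin, but they accumulate from the first row forward, whereas we need the maximum/minimum over the rows that follow. So we reverse the column, accumulate, and reverse back: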
df['Max Sell'] = df[::-1]['Sell'].cummax()[::-1]
df['Min Buy'] = df[::-1]['Buy'].cummin()[::-1]
Now, we can just compare each row:
df['Buy Profit'] = df['Max Sell'] - df['Buy']
df['Sell Profit'] = df['Sell'] - df['Min Buy']
I'm positive this isn't exactly what you want as I don't perfectly understand what you're trying to do, but hopefully it leads you in the right direction.
After comparing your function and mine, there is a slight difference, because your a is offset by one from the index. Removing that offset, you'll see that my method produces the same results as yours, only in vastly shorter time:
for index, row in df.iterrows():
    a = index
    df_slice = df[a:]
    assert (df_slice["Sell"].max() - row["Buy"]) == df['Max Sell'][a] - df['Buy'][a]
else:
    print("All assertions passed!")
Note that this check will still take the very long time required by your function. The one-row offset can be reproduced with shift, but I don't want to run your function for long enough to figure out which way to shift it.
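A minimal sketch of how the one-row offset could be handled, assuming "the following rows" means strictly after the current row as in the question's df[a:] with a = index + 1. The reversed cummax at row i covers rows i to the end, so shift(-1) moves the value for rows i+1 to the end onto row i. The small frame below reuses a few Sell/Buy values from the question; the helper names future_max_sell and future_min_buy are made up.

```python
import pandas as pd

df = pd.DataFrame({
    "Sell": [1925.16, 1925.12, 1926.16, 1950.96, 1190.90, 1926.12],
    "Buy":  [1929.83, 1929.79, 1930.83, 1987.56, 1911.56, 1930.79],
})

# Max Sell / min Buy over the rows strictly after each row (NaN on the last row).
future_max_sell = df["Sell"][::-1].cummax()[::-1].shift(-1)
future_min_buy = df["Buy"][::-1].cummin()[::-1].shift(-1)

df["Buy Profit"] = future_max_sell - df["Buy"]
df["Sell Profit"] = df["Sell"] - future_min_buy

# Comparisons with NaN are False, so the last row is never counted.
profitable_buys = int((df["Buy Profit"] > 0).sum())
profitable_sells = int((df["Sell Profit"] > 0).sum())
```

On these six rows, four Buy positions and four Sell positions have a profitable quote somewhere after them. The time until a position first becomes profitable is not covered by this sketch.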

Python: Why am I getting a "ZeroDivisionError: division by zero" in function?

My goal is to count the total amount of tweets in a file that fall under certain time zones.
I have the following function (I have noted the trouble area near the end of the function with comments):
def readTweets(inFile, wordsName):
    words = []
    lat = 0
    long = 0
    keyword = keywords(wordsName)
    sents = keywordSentiment(wordsName)
    value = 0
    eastern = 0
    central = 0
    mountain = 0
    pacific = 0
    a = 0
    b = 0
    c = 0
    d = 0
    easternTweets = 0
    centralTweets = 0
    mountainTweets = 0
    pacificTweets = 0
    for line in inFile:
        entry = line.split()
        for n in range(0, len(entry) - 1):
            entry[n] = entry[n].strip("[],!?#./-=+_#")
            if n > 4: # n>4 because words begin on 5th index of list
                entry[n] = entry[n].lower()
                words.append(entry[n])
        lat = float(entry[0])
        long = float(entry[1])
        timezone = getTimeZone(lat, long)
        if timezone == "eastern":
            easternTweets += 1
        if timezone == "central":
            centralTweets += 1
        if timezone == "mountain":
            mountainTweets += 1
        if timezone == "pacific":
            pacificTweets += 1
        for i in range(0, len(words)):
            for k in range(0, len(keyword)):
                if words[i] == keyword[k]:
                    value = int(sents[k])
                    if timezone == "eastern":
                        eastern += value
                        a += 1
                    if timezone == "central":
                        central += value
                        b += 1
                    if timezone == "mountain":
                        mountain += value
                        c += 1
                    if timezone == "pacific":
                        pacific += value
                        d += 1
    # the values of a,b,c,d are 0
    easternTotal = eastern/a # getting error
    centralTotal = central/b # for
    mountainTotal = mountain/c # these
    pacificTotal = pacific/d # values
    print("Total tweets per time zone:")
    print("Eastern: %d" % easternTweets)
    print("Central: %d" % centralTweets)
    print("Mountain: %d" % mountainTweets)
    print("Pacific: %d" % pacificTweets)
I am getting a ZeroDivisionError: division by zero error for easternTotal and the other total values that use a, b, c, and d for division.
If I print the values of a, b, c, or d it shows 0. My question is why are their values 0? Does the value of a, b, c, and d not change in the if statements?
So the only way this can happen is that the code that increments a, b, c and d is never reached.
That can have a few reasons:
inFile is empty, so the whole for loop never enters its body
len(words) is 0, so that for loop never enters its body
len(keyword) is 0, so that for loop never enters its body
No word in words ever equals a keyword, so the innermost if is never true
The value of timezone is something other than the values you test for
words is initially [], so its length can stay 0 if the loop that appends to it never runs.
From here, it's impossible for us to see which of these is happening, but it should be very easy for you to find out with some print statements or similar.
You divide eastern by 0. You can avoid it by doing
easternTotal = eastern/a if a > 0 else eastern
Because you set a, b, c, d = 0, "eastern/a" becomes "eastern/0" whenever readTweets(inFile, wordsName) does not get any data.
So make sure your readTweets() actually gets data first.
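The guard from the answers above could be wrapped in a small helper so the same check covers all four time zones and also reports which counters stayed at zero. This is only a sketch: safe_average and the two dictionaries are hypothetical stand-ins for eastern/a, central/b and so on, filled with made-up numbers.

```python
def safe_average(total, count, default=0.0):
    """Return total / count, or default when nothing was counted."""
    return total / count if count else default

# Made-up accumulators standing in for eastern/a, central/b, mountain/c, pacific/d.
scores = {"eastern": 12, "central": 0, "mountain": -3, "pacific": 0}
matches = {"eastern": 4, "central": 0, "mountain": 2, "pacific": 0}

averages = {zone: safe_average(scores[zone], matches[zone]) for zone in scores}

# Report the zones whose counter never moved; these are the ones worth debugging.
for zone, count in matches.items():
    if count == 0:
        print(f"no keyword matches counted for {zone}")
```

This avoids the exception, but a zero counter still means that the matching loop never ran for that zone, so the cause is worth tracking down rather than hiding.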
