I'm trying to create a trend streak from a column that holds 1, -1, or 0 (win/loss/no movement) in a pandas DataFrame. I'd like the streak to increase on each consecutive 1, reset to 0 on a 0, and reset and start a negative streak on -1. The desired result would be something like this:
win streak
0 0
1 1
1 2
1 3
1 4
0 0
0 0
-1 -1
-1 -2
1 1
Currently I have this that creates the win column.
dataframe.loc[dataframe['close'] > dataframe['close_1h'].shift(1), 'win'] = 1
dataframe.loc[dataframe['close'] < dataframe['close_1h'].shift(1), 'win'] = -1
dataframe.loc[dataframe['close'] == dataframe['close_1h'].shift(1), 'win'] = 0
dataframe['streak'] = numpy.nan_to_num(dataframe['win'].cumsum())
But that doesn't reset the streaks the way I'd like. I've also played around with groupby, doing dataframe['streak'] = dataframe.groupby([(dataframe['win'] != dataframe['win'].shift()).cumsum()]), but that raised "ValueError: Length of values (927) does not match length of index (1631)".
try this:
df['streak'] = df.groupby(df['win'].diff().ne(0).cumsum())['win'].cumsum()
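To see why this works, here is a minimal sketch that runs the same expression on the sample 'win' column from the question. The intermediate name run_id is only for illustration: diff().ne(0) marks every row where the value changes, and cumsum() turns those marks into one label per run, so the grouped cumsum restarts whenever the value changes.

```python
import pandas as pd

# Sample 'win' column copied from the desired output in the question
df = pd.DataFrame({"win": [0, 1, 1, 1, 1, 0, 0, -1, -1, 1]})

# A new run starts wherever the value differs from the previous row
run_id = df["win"].diff().ne(0).cumsum()

# Cumulative sum inside each run: 1s count up, -1s count down, 0s stay 0
df["streak"] = df.groupby(run_id)["win"].cumsum()

print(df)
```

If the real 'win' column contains NaN (for example in the first row after a shift), you may want to fill it with 0 first; that is an assumption about your data.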
Related
I have two columns that represent an on-switch and an off-switch indicator. I want to create a column called last switch that records the 'last' direction of the switch (on or off). An additional condition is that if both the on and off values are 1 in a given row, the 'last switch' output should be the opposite sign of the previous last switch. I've found a solution that is almost correct, but it gives the wrong result when both on and off are 1.
I've also attached a screenshot with the desired output. Any help is appreciated.
df=pd.DataFrame([[1,0],[1,0],[0,1],[0,1],[0,0],[0,0],[1,0],[1,1],[0,1],[1,0],[1,1],[1,1],[0,1]], columns=['on','off'])
df['last_switch']=(df['on']-df['off']).replace(0,method='ffill')
Add the following lines to your existing code:
for i in range(df.shape[0]):
    df['prev']=df['last_switch'].shift()
    df.loc[(df['on']==1) & (df['off']==1), 'last_switch']=df['prev'] * (-1)
df.drop('prev', axis=1, inplace=True)
df['last_switch']=df['last_switch'].astype(int)
Output:
on off last_switch
0 1 0 1
1 1 0 1
2 0 1 -1
3 0 1 -1
4 0 0 -1
5 0 0 -1
6 1 0 1
7 1 1 -1
8 0 1 -1
9 1 0 1
10 1 1 -1
11 1 1 1
12 0 1 -1
Let me know if you need an explanation of the code.
df=pd.DataFrame([[1,0],[1,0],[0,1],[0,1],[0,0],[0,0],[1,0],[1,1],[0,1],[1,0],[1,1],[1,1],[0,1]], columns=['on','off'])
df['last_switch']=(df['on']-df['off']).replace(0,method='ffill')
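# Holds the previously processed row, including any flipped last_switch
prev_row = None
def apply_logic(row):
    global prev_row
    if prev_row is not None:
        # Both switches set: flip the sign of the previous row's result
        if (row["on"] == 1) and (row["off"] == 1):
            row["last_switch"] = -prev_row["last_switch"]
    prev_row = row.copy()
    return row
# apply returns a new DataFrame, so assign the result back
df = df.apply(apply_logic,axis=1)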
Personally, I am not a big fan of looping over a DataFrame. shift won't work in this case, because the "last_switch" column is dynamic and changes based on the on and off status.
Using your intermediate result with apply, while carrying over the value from the previous row, should do the trick. Hope it makes sense.
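For comparison, here is a minimal sketch of the same rule as a plain Python loop over the two columns, which avoids the global variable and the intermediate ffill column. The helper name last_switch_states and the starting state of 0 (no switch seen yet) are my assumptions, not part of the question.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 0], [1, 0], [0, 1], [0, 1], [0, 0], [0, 0], [1, 0],
     [1, 1], [0, 1], [1, 0], [1, 1], [1, 1], [0, 1]],
    columns=["on", "off"],
)

def last_switch_states(on, off):
    """Hypothetical helper: track the last switch direction row by row."""
    state = 0  # assumption: no direction is known before the first row
    states = []
    for a, b in zip(on, off):
        if a == 1 and b == 1:
            state = -state   # both set: flip the previous direction
        elif a == 1:
            state = 1        # on only
        elif b == 1:
            state = -1       # off only
        # neither set: carry the previous direction forward
        states.append(state)
    return states

df["last_switch"] = last_switch_states(df["on"], df["off"])
print(df)
```

Because each state depends on the already-corrected previous state, a single pass is enough.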
How can I replace the values of a DataFrame that are smaller or greater than a particular value?
print(df)
name seq1 seq11
0 seq102 -14 -5.99
1 seq103 -5.25 -7.94
I want to set values less than -8.5 to 1 and values greater than -8.5 to 0.
I tried this, but all the values become zero:
import pandas as pd
df = pd.read_csv('df.csv')
num = df._get_numeric_data()
num[num < -8.50] = 1
num[num > -8.50] = 0
The desired output should be:
name seq1 seq11
0 seq102 1 0
1 seq103 0 0
Thank you
Try
df.iloc[:,1:] = df.iloc[:,1:].applymap(lambda x: 1 if x < -8.50 else 0)
Note that values equal to -8.50 will be set to zero here.
def thresh(x):
    if x < -8.5:
        return 1
    elif x > -8.5:
        return 0
    return x
# thresh handles one value at a time, so map it over each column
print(df[["seq1", "seq11"]].apply(lambda col: col.map(thresh)))
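As for why the original attempt zeroes everything: after the first assignment, the cells that were set to 1 are themselves greater than -8.50, so the second assignment overwrites them with 0. A minimal sketch of a single-step alternative is to build the boolean mask once and cast it to integers. The sample frame below is rebuilt by hand from the question's printout, and the cols list is my assumption about which columns to convert.

```python
import pandas as pd

# Rebuilt by hand from the printout in the question
df = pd.DataFrame({
    "name": ["seq102", "seq103"],
    "seq1": [-14.0, -5.25],
    "seq11": [-5.99, -7.94],
})

cols = ["seq1", "seq11"]  # assumption: the numeric columns to convert

# One comparison, evaluated on the original values only:
# True (value < -8.5) becomes 1, False becomes 0
df[cols] = (df[cols] < -8.5).astype(int)

print(df)
```

As with the applymap version, values exactly equal to -8.5 end up as 0 here.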
-) I'm working on an automation task in Python in which, in each row, the 1st negative value should be added to the 1st non-negative value from the left. The result should then replace the positive value, and 0 should replace the negative value.
-) This process should continue until the row contains only negative or only positive values.
**CUSTOMER <30Days 31-60 Days 61-90Days 91-120Days 120-180Days 180-360Days >360Days**
ABC -2 23 2 3 2 2 -1
(>360Days)+(180-360Days)
-1 + 2
CUSTOMER <30Days 31-60 Days 61-90Days 91-120Days 120-180Days 180-360Days >360Days
ABC -2 23 2 3 2 1 0
(<30Days)+(180-360Days)
-2 + 1
CUSTOMER <30Days 31-60 Days 61-90Days 91-120Days 120-180Days 180-360Days >360Days
ABC 0 23 2 3 2 -1 0
(180-360Days)+(120-180Days)
-1 + 2
CUSTOMER <30Days 31-60 Days 61-90Days 91-120Days 120-180Days 180-360Days >360Days
ABC 0 23 2 3 1 0 0
Check this code:
import pandas as pd
#Enter the data
new_row={'CUSTOMER':'ABC','<30Days':-2,'31-60 Days':23,'61-90Days':2,'91-120Days':3,'120-180Days':2,'180-360Days':2,'>360Days':-1}
#DataFrame.append was removed in pandas 2.0, so build the frame directly from the row
df=pd.DataFrame([new_row])
#Keep columns order as per the requirement
df=df[['CUSTOMER','<30Days','31-60 Days','61-90Days','91-120Days','120-180Days','180-360Days','>360Days']]
#Take column names and reverse the order
ls=list(df.columns)
ls.reverse()
#Remove non integer column
ls.remove('CUSTOMER')
#Initialize variables
flag1=1
flag=0
new_ls=[]
new_ls_index=[]
for j in range(len(df)):
    #Restart the search for every row
    flag1=1
    while flag1!=0:
        #Perform logic
        for i in ls:
            if int(df[i][j]) < 0 and flag == 0:
                #First negative value found
                new_ls.append(int(df[i][j]))
                new_ls_index.append(i)
                flag=1
            elif flag==1 and int(df[i][j]) >= 0 :
                #First non-negative value found after the negative one
                new_ls.append(int(df[i][j]))
                new_ls_index.append(i)
                flag=2
            elif flag==2:
                #Write the sum and the 0 into the current row only
                df.loc[j, new_ls_index[1]]=new_ls[0]+new_ls[1]
                df.loc[j, new_ls_index[0]]=0
                flag=0
                new_ls=[]
                new_ls_index=[]
        #Check all values in row either positive or negative
        if new_ls==[]:
            new_ls_neg=[]
            new_ls_pos=[]
            for i in ls:
                if int(df[i][j]) < 0:
                    new_ls_neg.append(int(df[i][j]))
                if int(df[i][j]) >= 0 :
                    new_ls_pos.append(int(df[i][j]))
            if len(new_ls_neg)==len(ls) or len(new_ls_pos)==len(ls):
                flag1=0 #Set flag to stop the loop
I have a pandas data frame that looks like:
Index Activity
0 0
1 0
2 1
3 1
4 1
5 0
...
1167 1
1168 0
1169 0
I want to count how many times it changes from 0 to 1 and when it changes from 1 to 0, but I do not want to count how many 1's or 0's there are.
For example, if I only wanted to count index 0 to 5, the count for 0 to 1 would be one.
How would I go about this? I have tried using some_value
This is a simple approach that can also tell you the index value when the change happens. Just add the index to a list.
c_1to0 = 0
c_0to1 = 0
for i in range(0, df.shape[0]-1):
    if df.iloc[i]['Activity'] == 0 and df.iloc[i+1]['Activity'] == 1:
        c_0to1 +=1
    elif df.iloc[i]['Activity'] == 1 and df.iloc[i+1]['Activity'] == 0:
        c_1to0 +=1
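If the frame is large, a vectorized variant may be faster. This is a minimal sketch, assuming 'Activity' only ever holds 0 and 1: diff() is then +1 exactly where a 0 is followed by a 1, and -1 where a 1 is followed by a 0. The names step and rise_index are only for illustration.

```python
import pandas as pd

# Rows 0 to 5 from the question
df = pd.DataFrame({"Activity": [0, 0, 1, 1, 1, 0]})

# Difference to the previous row: +1 is a 0->1 change, -1 is a 1->0 change
step = df["Activity"].diff()

c_0to1 = int((step == 1).sum())
c_1to0 = int((step == -1).sum())

# Index labels of the rows where the value first becomes 1
rise_index = df.index[step == 1].tolist()

print(c_0to1, c_1to0, rise_index)
```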
Recently, for a school project, I've been making a "Treasure hunt" game in Python where the player finds treasure and bandits on a grid. I have a way to build the grid at a set size, but for extra points we are asked to make the size of the grid, the number of chests, and the number of bandits configurable.
Here is the code for my grid maker. It won't build the "grid" array, but it does build "playergrid":
def gridmaker(gridsize, debug):
  global grid
  global playergrid
  gridinator = 1
  grid = [[0]]
  playergrid = [[" "]]
  if debug == 1:
    while gridinator <= gridsize:
      grid[gridinator].append(0)
      gridinator = gridinator + 1
    gridinator = 1
  else:
    while gridinator <= gridsize:
      playergrid[0].append(gridinator)
      gridinator = gridinator + 1
    gridinator = 1
  while gridinator <= gridsize:
    if debug == 1:
      grid.append([0])
      for i in range(gridsize):
        grid[gridinator].append(0)
    else:
      playergrid.append([gridinator])
      for i in range(gridsize):
        playergrid[gridinator].append("#")
    gridinator = gridinator+1
  if debug == 1:
    grid[1][1] = 1
  else:
    playergrid[1][1] = "P"
gridmaker(9, 1)
for row in grid:
  print(" ".join(map(str,row)))
Sorry if it is formatted differently, as it uses 2-space indentation rather than 4; it works best on repl.it
print(grid) should return a grid like this:
0 0 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
Please let me know,
Thanks!
You have to remember that lists are 0-indexed.
Which means that to access the 1st element of the grid list you would use the index 0.
With grid = [[0]] you create a list with one item (you can get that item with grid[0]), which is a list whose 1st item (grid[0][0]) is 0.
But your gridinator's starting value is 1. So when your first append runs:
grid[gridinator].append(0)
it tries to access the 2nd element of grid:
grid[1].append(0)
This gives you an IndexError because, as the traceback should tell you*, the list index is out of range.
You can try this yourself:
grid = [[0]]
grid[0]
grid[1]
One solution could be to start gridinator at 0 and use a strict less-than instead of less-than-or-equal in gridinator <= gridsize (because grid[8] gives you the 9th element of the grid).
*Please remember to include the traceback for errors in the future. They really help both yourself and the people trying to help you.
Let me know if this helps, or if I should find another way to explain it.
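If it helps, here is a minimal sketch of one way to build a square grid of any size without tracking indices by hand, using a nested list comprehension. make_grid is a hypothetical helper name, not something from your code, and the sketch only covers the plain "grid" of zeros, not the labelled "playergrid".

```python
def make_grid(size, fill=0):
    """Hypothetical helper: return a size x size grid filled with `fill`."""
    # The inner comprehension runs once per row, so every row is its own list
    return [[fill for _ in range(size)] for _ in range(size)]

grid = make_grid(9)
grid[1][1] = 1  # second row, second column, because lists are 0-indexed

for row in grid:
    print(" ".join(map(str, row)))
```

Building each row inside the comprehension matters: writing [[0] * size] * size would make every row the same list object, so changing one cell would change the whole column.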