Pandas count of sequence of positive and negative numbers

The problem is probably simple, but I can't work it out.
Imagine you have this pandas Series:
y = pd.Series([5, 5, -5, -10, 7, 7])
z = y * 0
I would like to get this output:
1, 2, -1, -2, 1, 2
My solution is below:
for i, row in y.items():
    if i == 0 and y[i] > 0:
        z[i] = 1
    elif i == 0:
        z[i] = -1
    elif y[i] >= 0 and y[i-1] >= 0:
        z[i] = 1 + z[i-1]
    elif y[i] < 0 and y[i-1] < 0:
        z[i] = -1 + z[i-1]
    elif y[i] >= 0 and y[i-1] < 0:
        z[i] = 1
    elif y[i] < 0 and y[i-1] >= 0:
        z[i] = -1
I would think there is a more Pythonic/pandas solution.

You can use np.sign() to get the sign of each number and compare it with the sign of the previous row using shift(). The cumulative sum of those changes labels each run of equal sign. Finally, use cumcount() to number the rows within each run:
import numpy as np
import pandas as pd

y = pd.Series([5, 5, -5, -10, 7, 7])
parts = (np.sign(y) != np.sign(y.shift())).cumsum()
print((y.groupby(parts).cumcount() + 1) * np.sign(y))
# or print(y.groupby(parts).cumcount().add(1).mul(np.sign(y)))
Output:
0    1
1    2
2   -1
3   -2
4    1
5    2

Sign changes (turning points) are the positions where the difference of np.sign(y) is not 0. The cumulative sum of that mask labels each run of consecutive same-sign values. Finally, cumcount numbers the rows within each run, and multiplying by the sign makes the counts negative for negative runs:
signs = np.sign(y)
grouper = signs.diff().ne(0).cumsum()
result = y.groupby(grouper).cumcount().add(1).mul(signs)
Here add(1) is needed because cumcount gives 0, 1, ... but the counts should start at 1.
>>> result
0    1
1    2
2   -1
3   -2
4    1
5    2
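To see why the grouping works, it can help to look at the intermediate Series. The sketch below reuses the sample data from the question. The names `changed` and `position` are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

y = pd.Series([5, 5, -5, -10, 7, 7])

signs = np.sign(y)                         # 1, 1, -1, -1, 1, 1
changed = signs.diff().ne(0)               # True where the sign differs from the previous row
grouper = changed.cumsum()                 # 1, 1, 2, 2, 3, 3: one label per run
position = y.groupby(grouper).cumcount()   # 0, 1, 0, 1, 0, 1: position inside each run
result = position.add(1).mul(signs)        # 1, 2, -1, -2, 1, 2

# Show every step side by side.
print(pd.DataFrame({"y": y, "sign": signs, "run": grouper, "result": result}))
```

The first row always starts a run, because its diff is NaN and NaN is not equal to 0. Note that np.sign(0) is 0, so with this approach zeros would form their own run and produce 0 in the result.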

Related

Find index of less or equal value in list recursively (python)

I got a task to find the indexes of a value and of double that value in an input list.
The first input line is the list size, the second is the list of values, and the third is the value to find.
The output is 2 numbers: the index of the equal or higher value, and the index of the value equal to or higher than double the value. If there is none, return -1.
Example input:
6
1 2 4 4 6 8
3
Example output:
3 5
What I have so far is a standard binary search function, but I don't see how to make it search not only for the exact number but also for the nearest higher one.
def binarySearch(arr, x, left, right):
    if right <= left:
        return -1
    mid = (left + right) // 2
    if arr[mid] >= x:
        return mid
    elif x < arr[mid]:
        return binarySearch(arr, x, left, mid)
    else:
        return binarySearch(arr, x, mid + 1, right)

def main():
    n = int(input())
    k = input().split()
    q = []
    for i in k:
        q.append(int(i))
    s = int(input())
    res1 = binarySearch(q, s, q[0], (n-1))
    res2 = binarySearch(q, (s*2), q[0], (n-1))
    print(res1, res2)

if __name__ == "__main__":
    main()
The input is:
6
1 2 4 4 6 8
3
And output:
3 4

Here's a modified binary search that returns the zero-based index of a value if it is found, or otherwise the index of the next higher value in the list (-1 if there is none).
def bsearch(lst, x):
    L = 0
    R = len(lst) - 1
    while L <= R:
        m = (L + R) // 2
        if (v := lst[m]) == x:
            return m
        if v < x:
            L = m + 1
        else:
            R = m - 1
    return L if L < len(lst) else -1

data = list(map(int, '1 2 4 4 6 8'.split()))
for x in range(10):
    print(x, bsearch(data, x))
Output:
0 0
1 0
2 1
3 2
4 2
5 4
6 4
7 5
8 5
9 -1
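For comparison, the standard library's bisect module already provides this kind of search. The sketch below assumes the list is sorted in ascending order. `first_at_least` is a hypothetical helper name, not something from the question. When the value occurs several times, bisect_left returns the leftmost position.

```python
from bisect import bisect_left

def first_at_least(values, target):
    """Return the zero-based index of the first element >= target, or -1 if there is none."""
    i = bisect_left(values, target)
    return i if i < len(values) else -1

data = [1, 2, 4, 4, 6, 8]
s = 3
print(first_at_least(data, s), first_at_least(data, 2 * s))  # 2 4
```

The expected output in the question (3 5) looks one-based, so add 1 to each found index if that is what the task wants.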

Python arithmetic function with recursion

This can probably be done with a rolling function in pandas, but I'm not sure. I would like to apply the following function to a list, where the current state S in position x is defined as
S[x] = S[x-1] - 1 + S[x] if S[x-1] > 0 else S[x] - 1, for x >= 1
In other words, the new state is the current value minus 1, plus the previous state when that state is positive. I need a kind of cumulative sum of all the previous positions, minus 1, plus the current position.
An example for the list
[1,1,2,0,0,2]
returns these values
[0,0,1,0,-1,1]
because:
S[0] = 1 - 1 = 0
S[1] = S[1] - 1 + S[0] = 1 - 1 + 0 = 0
S[2] = S[2] - 1 + S[1] = 2 - 1 + 0 = 1
S[3] = S[3] - 1 + S[2] = 0 - 1 + 1 = 0
S[4] = S[4] - 1 + S[3] = 0 - 1 + 0 = -1
S[5] = S[5] - 1 (S[4] is not added: it is smaller than 0, so the else rule applies) = 2 - 1 = 1
I am pretty sure this can be done in pandas, but I am also open to a standard Python function that I pass a list to (I'd prefer pandas, though).
I have been trying recursion and failed miserably.

Subtract 1, then use cumsum:
(s-1).cumsum()
0    0
1    0
2    1
3    0
4   -1
5    0
Here is a revised solution that takes the condition into account when computing the cumulative sum:
np.where(((s.shift(1) - 1).cumsum()) > 0,
         (s-1).cumsum(),
         s-1)
[ 0, 0, 1, 0, -1, 1]
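Because each state depends on the previously computed state, a plain loop is a direct way to express the rule. The sketch below is one reading of the recurrence in the question, where the previous state is added only when it is greater than 0. `running_state` is a hypothetical name. It is worth checking any vectorized version against it on more than one input: for [0, 3, 1] this loop gives [-1, 2, 2], while the np.where expression above gives [-1, 2, 1].

```python
def running_state(values):
    """Apply S[x] = values[x] - 1, plus S[x-1] when S[x-1] > 0, from left to right."""
    states = []
    previous = 0  # nothing to carry over before the first element
    for value in values:
        carry = previous if previous > 0 else 0  # a non-positive state resets the sum
        previous = value - 1 + carry
        states.append(previous)
    return states

print(running_state([1, 1, 2, 0, 0, 2]))  # [0, 0, 1, 0, -1, 1]
```

If the data lives in a pandas Series, passing `s.tolist()` and wrapping the result in `pd.Series(..., index=s.index)` would keep the original index.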

Replacing positive, negative, and zero values by 1, -1, and 0 respectively

I have a pandas DataFrame (100,000 observations) with 11 columns.
I'm trying to assign df['trade_sign'] values based on df['diff'] (which is a pd.Series of integer values):
If diff is positive, then trade_sign = 1
If diff is negative, then trade_sign = -1
If diff is 0, then trade_sign = 0
What I've tried so far:
pos['trade_sign'] = (pos['trade_sign']>0)
pos['trade_sign'].replace({False: -1, True: 1}, inplace=True)
But this obviously doesn't take 0 values into account.
I also tried for loops with if conditions, but that didn't work.
Essentially, how do I fix my .replace call so that it handles diff values of 0?
Ideally, I'd prefer a solution that uses numpy over for loops with if conditions.

There's a sign function in numpy:
df["trade_sign"] = np.sign(df["diff"])
If you want integers,
df["trade_sign"] = np.sign(df["diff"]).astype(int)

a = [-1 if d < 0 else 1 if d > 0 else 0 for d in df['diff']]
df['trade_sign'] = a

You could do it this way:
pos['trade_sign'] = (pos['diff'] > 0) * 1 + (pos['diff'] < 0) * -1
The boolean results of the element-wise > and < comparisons automatically get converted to int in order to allow multiplication with 1 and -1, respectively.
This sample input and test code:
import pandas as pd
pos = pd.DataFrame({'diff':[-9,0,9,-8,0,8,-7-6-5,4,3,2,0]})
pos['trade_sign'] = (pos['diff'] > 0) * 1 + (pos['diff'] < 0) * -1
print(pos)
... gives this output:
    diff  trade_sign
0     -9          -1
1      0           0
2      9           1
3     -8          -1
4      0           0
5      8           1
6    -18          -1
7      4           1
8      3           1
9      2           1
10     0           0
UPDATE: In addition to the solution above, as well as some of the other excellent ideas in other answers, you can use numpy where:
pos['trade_sign'] = np.where(pos['diff'] > 0, 1, np.where(pos['diff'] < 0, -1, 0))
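As a quick sanity check, the approaches suggested in this thread can be compared on the same column. The sketch below uses a small made-up sample. The names `via_sign`, `via_bools`, and `via_where` are only for illustration.

```python
import numpy as np
import pandas as pd

pos = pd.DataFrame({"diff": [-9, 0, 9, -8, 0, 8]})

# Three ways to map negative / zero / positive to -1 / 0 / 1.
via_sign = np.sign(pos["diff"]).astype(int)
via_bools = (pos["diff"] > 0) * 1 + (pos["diff"] < 0) * -1
via_where = np.where(pos["diff"] > 0, 1, np.where(pos["diff"] < 0, -1, 0))

print(via_sign.tolist())   # [-1, 0, 1, -1, 0, 1]
print(via_bools.tolist())  # same values
print(via_where.tolist())  # same values
```

np.sign and the boolean arithmetic return a Series, while np.where returns a plain NumPy array. All three can be assigned to pos['trade_sign'].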

A test interview question I could not figure out

I wrote a piece of code in PyCharm
to solve this problem:
Pick any 5 positive integers that add up to 100
such that, by addition, subtraction, or just using one of the five values,
you can make every number up to 100.
For example, with
1, 22, 2, 3, 4
for 1 I could use 1,
for 2 I could use 2,
and so on;
for 21 I could use 22 - 1,
for 25 I could use (22 + 2) - 1.
li = [1, 1, 1, 1, 1]
lists_of_li_that_pass_T1 = []
while True:
    if sum(li) == 100:
        list_of_li_that_pass_T1.append(li)
        if li[-1] != 100:
            li[-1] += 1
        else:
            li[-1] = 1
            if li[-2] != 100:
                li[-2] += 1
            else:
                li[-2] = 1
                if li[-3] != 100:
                    li[-3] += 1
                else:
                    li[-3] = 1
                    if li[-4] != 100:
                        li[-4] += 1
                    else:
                        li[-4] = 1
                        if li[-5] != 100:
                            li[-5] += 1
                        else:
                            break
    else:
        if li[-1] != 100:
            li[-1] += 1
        else:
            li[-1] = 1
            if li[-2] != 100:
                li[-2] += 1
            else:
                li[-2] = 1
                if li[-3] != 100:
                    li[-3] += 1
                else:
                    li[-3] = 1
                    if li[-4] != 100:
                        li[-4] += 1
                    else:
                        li[-4] = 1
                        if li[-5] != 100:
                            li[-5] += 1
                        else:
                            break
This should give me all the number combinations that add up to 100 out of the total of 1 * 10**10,
but it's not working. Please help me fix it so it prints all of the sets of integers.
I also can't think of what I would do next to find the sets that actually solve the problem.

After @JohnY's comments, I assume that the question is:
Find a set of 5 integers meeting the following requirements:
their sum is 100
any number in the [1, 100] range can be constructed using each element of the set at most once and only additions and subtractions
A brute-force approach is certainly possible, but proving that any number can be constructed that way would be tedious. A divide-and-conquer strategy is possible instead: to construct all numbers up to n with a set of m numbers u_0, ..., u_(m-1), it is enough to build all numbers up to (n+2)/3 with u_0, ..., u_(m-2) and use u_(m-1) = 2*n/3 (both rounded down). Any number in the ((n+2)/3, u_(m-1)) range can be written as u_(m-1) - x with x in the [1, (n+2)/3] range, and any number in the (u_(m-1), n] range as u_(m-1) + y with y in the same low range.
So we can use here u_4 = 66 and find a way to build numbers up to 34 with 4 numbers.
Let us iterate: u_3 = 22 and build numbers up to 12 with 3 numbers.
One more step: u_2 = 8 and build numbers up to 4 with 2 numbers.
OK: u_0 = 1 and u_1 = 3 immediately give:
1 = u_0
2 = 3 - 1 = u_1 - u_0
3 = u_1
4 = 3 + 1 = u_1 + u_0
Done.
Mathematical digression:
In fact u_0 = 1 and u_1 = 3 can build all numbers up to 4, so we can use u_2 = 9 to build all numbers up to 9 + 4 = 13. We can easily prove that the sequence u_i = 3^i satisfies sum(u_i for i in [0, m-1]) = 1 + 3 + ... + 3^(m-1) = (3^m - 1)/(3 - 1) = (u_m - 1)/2.
So we could use u_0 = 1, u_1 = 3, u_2 = 9, u_3 = 27 to build all numbers up to 40, and finally set u_4 = 60.
In fact, u_0 and u_1 can only be 1 and 3, and u_2 can be 8 or 9. Then if u_2 == 8, u_3 can be in the [22, 25] range, and if u_2 == 9, u_3 can be in the [21, 27] range. The high limit is given by the 3^i sequence, and the low limit is given by the requirement to build numbers up to 12 with 3 numbers and up to 34 with 4.
No code was used, but I think this approach is much quicker and less error-prone. It is now possible to use Python to show that all numbers up to 100 can be constructed from one of those sets using the divide-and-conquer strategy.
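A short brute-force check of a candidate set could look like the sketch below. It tries every way of adding, subtracting, or skipping each of the five numbers (3^5 = 243 combinations) and collects the positive totals. The function names are hypothetical, and the two candidates are sets that follow from the reasoning above: 1, 3, 8, 22, 66 and 1, 3, 9, 27, 60.

```python
from itertools import product

def reachable(values):
    """Return every positive total obtainable by adding or subtracting each value at most once."""
    totals = set()
    for signs in product((-1, 0, 1), repeat=len(values)):
        total = sum(sign * value for sign, value in zip(signs, values))
        if total > 0:
            totals.add(total)
    return totals

def solves_puzzle(values):
    """True if the values sum to 100 and every number from 1 to 100 is reachable."""
    return sum(values) == 100 and set(range(1, 101)) <= reachable(values)

for candidate in [(1, 3, 8, 22, 66), (1, 3, 9, 27, 60)]:
    print(candidate, solves_puzzle(candidate))  # both print True
```

The example from the question, 1, 22, 2, 3, 4, fails this check because it does not sum to 100.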

How do I add to a grid coordinate in python?

What I'm trying to do is have a 2D array and for every coordinate in the array, ask all the other 8 coordinates around it if they have stored a 1 or a 0. Similar to a minesweeper looking for mines.
I used to have this:
grid = []
for fila in range(10):
    grid.append([])
    for columna in range(10):
        grid[fila].append(0)
#edited
for fila in range(10):
    for columna in range(10):
        neighbour = 0
        for i in range(10):
            for j in range(10):
                if grid[fila + i][columna + j] == 1:
                    neighbour += 1
But something didn't work well. I also added print statements to try to find the error that way, but I still didn't understand why it only ran half of the for loop. So I changed the second for loop to this:
#edited
for fila in range(10):
    for columna in range(10):
        neighbour = 0
        if grid[fila - 1][columna - 1] == 1:
            neighbour += 1
        if grid[fila - 1][columna] == 1:
            neighbour += 1
        if grid[fila - 1][columna + 1] == 1:
            neighbour += 1
        if grid[fila][columna - 1] == 1:
            neighbour += 1
        if grid[fila][columna + 1] == 1:
            neighbour += 1
        if grid[fila + 1][columna - 1] == 1:
            neighbour += 1
        if grid[fila + 1][columna] == 1:
            neighbour += 1
        if grid[fila + 1][columna + 1] == 1:
            neighbour += 1
And got this error:
if grid[fila - 1][columna + 1] == 1:
IndexError: list index out of range
It seems like I can't add on the grid coordinates but I can subtract. Why is that?

Valid indices in Python run from -len(grid) to len(grid)-1. Positive indices access elements by offset from the front, negative ones from the rear. Adding gives an IndexError when the index is greater than len(grid)-1, which is what you see. Subtracting does not give an error unless the index drops below -len(grid). Although you do not check for the lower bound, which is 0, it seems to work for you because small negative indices return values from the rear end. This is a silent error that leads to wrong neighbor counts.
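The silent wrap-around is easy to reproduce on a small grid. In the sketch below, `cell` is a hypothetical helper that treats anything outside the grid as empty. The 3x3 grid is made up for the demonstration.

```python
def cell(grid, r, c, default=0):
    """Return grid[r][c], or default when (r, c) lies outside the grid."""
    if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
        return grid[r][c]
    return default

grid = [[0] * 3 for _ in range(3)]
grid[2][2] = 1  # a single 1 in the bottom-right corner

# Looking "up and left" from (0, 0) without a bounds check:
print(grid[0 - 1][0 - 1])  # 1: the negative indices wrap to the last row and column
# The same lookup with a bounds check:
print(cell(grid, -1, -1))  # 0: outside the grid, so it is not counted
```

Without the check, the top-left cell would count the bottom-right corner as a neighbor.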

If you are computing offsets, you need to make sure your offsets are within the bounds of the lists you have. So if you have 10 elements, don't try to access the 11th element.
import collections

grid_offset = collections.namedtuple('grid_offset', 'dr dc')

Grid = [[0 for c in range(10)] for r in range(10)]
Grid_height = len(Grid)
Grid_width = len(Grid[0])

Neighbors = [
    grid_offset(dr, dc)
    for dr in range(-1, 2)
    for dc in range(-1, 2)
    if not dr == dc == 0
]

def count_neighbors(row, col):
    count = 0
    for nb in Neighbors:
        r = row + nb.dr
        c = col + nb.dc
        if 0 <= r < Grid_height and 0 <= c < Grid_width:
            # Add the value, or just add one?
            count += Grid[r][c]
    return count

Grid[4][6] = 1
Grid[5][4] = 1
Grid[5][5] = 1

for row in range(10):
    for col in range(10):
        print(count_neighbors(row, col), "", end='')
    print()
Prints:
$ python test.py
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 1 1 0 0
0 0 0 1 2 3 1 1 0 0
0 0 0 1 1 2 2 1 0 0
0 0 0 1 2 2 1 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0

The error is exactly what it says, you need to check if the coordinates fit within the grid:
0 <= i < 10 and 0 <= j < 10
Otherwise you're trying to access an element that doesn't exist in memory, or an element that's not the one you're actually thinking about - Python handles negative indexes, they're counted from the end.
E.g. a[-1] is the last element, exactly the same as a[len(a) - 1].
