I have a dataframe with two columns, one containing a number and the other containing a list. I want to check whether each row's number appears in that row's list.
Testing data:
import pandas as pd

preds = [[40, 50, 21], [40, 50, 25], [40, 50, 21]]
target = [40, 50, 40]
df_testing = pd.DataFrame(list(zip(preds, target)),
                          columns=['preds', 'target'])
so for example, in the first row, I want to check if 40 is in [40, 50, 21], for 2nd row, I want to check if 50 is in [40, 50, 25] etc.
Desired result: return a Series of True/False values, indexed by row.
Use the `in` operator row by row:
preds = [[44, 55, 21], [40, 50, 25], [40, 50, 21]]
target = [40, 50, 40]
df_testing = pd.DataFrame(list(zip(preds, target)),
                          columns=['preds', 'target'])
s = df_testing.apply(lambda x: x['target'] in x['preds'], axis=1)
print (s)
0 False
1 True
2 True
dtype: bool
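For larger frames, a list comprehension over the two columns is a common alternative to `apply(axis=1)`. The following is a minimal sketch, assuming the same `preds`/`target` layout as the answer above; it has not been benchmarked here.

```python
import pandas as pd

preds = [[44, 55, 21], [40, 50, 25], [40, 50, 21]]
target = [40, 50, 40]
df_testing = pd.DataFrame({'preds': preds, 'target': target})

# Pair each target with its list and test membership row by row.
s = pd.Series(
    [t in p for t, p in zip(df_testing['target'], df_testing['preds'])],
    index=df_testing.index,
)
print(s.tolist())  # [False, True, True]
```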
Related
How do I retrieve Pandas dataframe rows that its column (one) values are consecutively equal to the values of a list?
Example, given this:
import pandas as pd
df = pd.DataFrame({'col1': [10, 20, 30, 40, 50, 88, 99, 30, 40, 50]})
lst = [30, 40, 50]
I want to extract the dataframe rows from 30 to 50, but just the first sequence of consecutive values (just the 2 to 4 index rows).
This should do the trick:
df = pd.DataFrame({'col1': [10, 20, 30, 40, 50, 88, 99, 30, 40, 50]})
lst = [30, 40, 50]
ans = []
for i, num in enumerate(df['col1']):
    if num in lst:
        lst.remove(num)
        ans.append(i)
print(ans)
You can use a rolling comparison:
s = df['col1'][::-1].rolling(len(lst)).apply(lambda x: x.eq(lst[::-1]).all())[::-1].eq(1)
if s.any():
    idx = s.idxmax()
    out = df.iloc[idx:idx+len(lst)]
    print(out)
else:
    print('Not found')
Output:
   col1
2    30
3    40
4    50
Try:
lst = [30, 40, 50]
if any(lst == (found := s).to_list() for s in df["col1"].rolling(len(lst))):
    print(df.loc[found.index])
Prints:
   col1
2    30
3    40
4    50
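If pandas is not required for the search itself, the same idea can be sketched as a plain sliding-window comparison. `find_run` is a hypothetical helper name, and `values` stands in for `df['col1'].tolist()`; with a DataFrame, the returned start position could then be used as `df.iloc[start:start + len(lst)]`.

```python
def find_run(values, pattern):
    """Return the start position of the first consecutive match, or -1."""
    n = len(pattern)
    for start in range(len(values) - n + 1):
        # Compare one window of length n against the whole pattern.
        if values[start:start + n] == pattern:
            return start
    return -1

values = [10, 20, 30, 40, 50, 88, 99, 30, 40, 50]
lst = [30, 40, 50]
start = find_run(values, lst)
positions = list(range(start, start + len(lst))) if start >= 0 else []
print(positions)  # [2, 3, 4]
```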
I have this matrix, built from 4 lists:
results = [
    [20, 20, 20, 60],
    [35, 35, 20, 80],
    [10, 10, 10, 30],
    [40, 40, 40, 130]
]
The fourth element of each list is the sum of the other elements (for instance, in the first list: 20+20+20=60).
Some lists contain errors that must be found and fixed by an automatic function, so manually replacing a wrong element with the correct one, such as results[3][3] = 120, is not acceptable. I need an automated way to do this without errors.
You don't need to check the sum, just recompute it and set the result to the last item of the current list.
for r in results:
    r[-1] = sum(r[:-1])
print(results)
# Output:
[[20, 20, 20, 60],
[35, 35, 20, 90],
[10, 10, 10, 30],
[40, 40, 40, 120]]
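If it is also useful to know which rows were wrong before overwriting them, a check along these lines could run first. This is only a sketch; `find_bad_rows` is a hypothetical helper name, not part of the answer above.

```python
results = [
    [20, 20, 20, 60],
    [35, 35, 20, 80],
    [10, 10, 10, 30],
    [40, 40, 40, 130],
]

def find_bad_rows(rows):
    """Return (row index, stored total, expected total) for each mismatch."""
    return [
        (i, row[-1], sum(row[:-1]))
        for i, row in enumerate(rows)
        if row[-1] != sum(row[:-1])
    ]

bad = find_bad_rows(results)
print(bad)  # [(1, 80, 90), (3, 130, 120)]
```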
For each row, I want to count how many of the other columns in the df are greater than or equal to a reference column. Given testdf:
testdf = pd.DataFrame({'RefCol': [10, 20, 30, 40],
'Col1': [11, 19, 29, 40],
'Col2': [12, 21, 28, 39],
'Col3': [13, 22, 31, 38]
})
I am using the helper function:
def sorter(row):
    sortedrow = row.sort_values()
    return sortedrow.index.get_loc('RefCol')
as:
testdf['Score'] = testdf.apply(sorter, axis=1)
With real data this method is very slow. How can I speed it up? Thanks
It looks like you need to compare every column with RefCol and count how many are less than RefCol. Use:
testdf.lt(testdf['RefCol'],axis=0).sum(1)
0 0
1 1
2 2
3 2
For greater than or equal to, drop RefCol first and use:
testdf.drop(columns='RefCol').ge(testdf['RefCol'], axis=0).sum(axis=1)
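To sanity-check the two counts side by side, a self-contained sketch on the sample frame might look like this. The variable names `others`, `below` and `at_or_above` are illustrative only.

```python
import pandas as pd

testdf = pd.DataFrame({'RefCol': [10, 20, 30, 40],
                       'Col1': [11, 19, 29, 40],
                       'Col2': [12, 21, 28, 39],
                       'Col3': [13, 22, 31, 38]})

others = testdf.drop(columns='RefCol')
# Count, per row, the other columns strictly below / at or above RefCol.
below = others.lt(testdf['RefCol'], axis=0).sum(axis=1)
at_or_above = others.ge(testdf['RefCol'], axis=0).sum(axis=1)
print(below.tolist())        # [0, 1, 2, 2]
print(at_or_above.tolist())  # [3, 2, 1, 1]
```

The two counts always add up to the number of non-reference columns (3 here), which makes a quick consistency check on real data.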
I have a list inside a loop, e.g.
A=[25,45,34,....87]
In the next iteration A should be
A=[[25,32],[45,13],[34,65],....[87,54]]
In the iteration after that A should be
A=[[25,32,44],[45,13,67],[34,65,89],....[87,54,42]]
and so on. How can I do that? Is it possible? The code I am working on is:
s=0
e=25
for i in range(0,4800):
    if not m_list_l:
        m_list_l.append(max(gray_sum[s:e]))
    m_list_l[i].append(max(gray_sum[s:e]))
    s+=25
    e+=25
But this gives me the following error:
m_list_l[i].append(max(gray_sum[s:e]))
AttributeError: 'int' object has no attribute 'append'
The first element you insert should be a list, not an int. Change m_list_l.append(max(gray_sum[s:e])) to m_list_l.append([max(gray_sum[s:e])]) to fix this.
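The overall pattern (create the inner lists on the first pass, append to them on later passes) can be sketched as follows. The `iterations` data is a hypothetical stand-in for the per-iteration maxima computed from `gray_sum`, which is not shown in the question.

```python
# Hypothetical stand-in data: each inner list is one iteration's values.
iterations = [
    [25, 45, 34, 87],
    [32, 13, 65, 54],
    [44, 67, 89, 42],
]

m_list_l = []
for values in iterations:
    if not m_list_l:
        # First pass: wrap each value in its own list so later appends work.
        m_list_l = [[v] for v in values]
    else:
        # Later passes: append the new value to the matching inner list.
        for inner, v in zip(m_list_l, values):
            inner.append(v)

print(m_list_l)
# [[25, 32, 44], [45, 13, 67], [34, 65, 89], [87, 54, 42]]
```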
Say there are two lists:
A = [i for i in range(10,100,10)]
A
[10, 20, 30, 40, 50, 60, 70, 80, 90]
B = [i for i in range(20,110,10)]
B
[20, 30, 40, 50, 60, 70, 80, 90, 100]
The combined list would be
L = [[i,j] for i,j in zip(A,B)]
L
[[10, 20],
[20, 30],
[30, 40],
[40, 50],
[50, 60],
[60, 70],
[70, 80],
[80, 90],
[90, 100]]
I want to explore every possible community allocation of 10 nodes. I have 10 items in total: 10 15 25 30 45 50 65 75 80 90. There are two lists (communities), c1 and c2, to which I will allocate these items. Initially, I split the 10 items as follows:
c1 = [10, 45, 50, 75, 90] c2 = [15, 25, 30, 65, 80]
Now I want to move one item to another list like:
c1 = [45, 50, 75, 90] c2 = [10, 15, 25, 30, 65, 80]
c1 = [10, 45, 50, 75] c2 = [15, 25, 30, 65, 80, 90]
...
I also want to move two, three, or four items (but not five), like:
c1 = [50, 75, 90] c2 = [10, 15, 25, 30, 45, 65, 80]
c1 = [10, 75, 90] c2 = [15, 25, 30, 45, 50, 65, 80]
...
c1 = [75, 90] c2 = [10, 15, 25, 30, 45, 50, 65, 80]
c1 = [10, 90] c2 = [15, 25, 30, 45, 50, 65, 75, 80]
...
c1 = [90] c2 = [10, 15, 25, 30, 45, 50, 65, 75, 80]
c1 = [45] c2 = [10, 15, 25, 30, 50, 65, 75, 80, 90]
...
I want to generate every possible move of 1-4 items from c1 to c2. (Total 31 possibilities: 2^5-1) The order inside each list doesn't matter. How can I do this?
I used the following code.
c1 = [10, 45, 50, 75, 90]
c2 = [15, 25, 30, 65, 80]
for i in c1:
    c2.append(i)
    c1.remove(i)
    print(c1, c2)
With this code, I only get the following result. It doesn't even accomplish the task of moving a single item to c2, and it makes no attempt to move multiple items.
[45, 50, 75, 90] [15, 25, 30, 65, 80, 10]
[45, 75, 90] [15, 25, 30, 65, 80, 10, 50]
[45, 75] [15, 25, 30, 65, 80, 10, 50, 90]
How can I successfully finish the task of moving items to c2? With this task, I can get every possible allocation of 10 items to two lists (ignoring cases c1==c2).
Try:
c1.append(c2.pop(i))
c1.sort()
OR
c2.append(c1.pop(i))
c2.sort()
where i is the index of the item to move.
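As a concrete illustration of the second variant, moving the first element of c1 into c2 might look like this (a minimal sketch; the choice of `i = 0` is only an example):

```python
c1 = [10, 45, 50, 75, 90]
c2 = [15, 25, 30, 65, 80]

i = 0  # index of the item to move (here the first element of c1)
c2.append(c1.pop(i))
c2.sort()

print(c1, c2)
# [45, 50, 75, 90] [10, 15, 25, 30, 65, 80]
```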
As far as I understand you are more interested in the algorithm instead of simply appending from one list to another.
There is a standard library function which provides combinations of an iterable.
It is really a good exercise to make your own combinations function.
Quick and dirty solution to your problem:
import itertools

c1 = [10, 45, 50, 75, 90]
c2 = [15, 25, 30, 65, 80]
print(c1, c2)
for i in range(1, 5):
    # Every way of choosing i items from c1 to move into c2.
    for c in itertools.combinations(c1, i):
        mc1 = sorted(set(c1).difference(c))
        mc2 = sorted(set(c2).union(c))
        print(mc1, mc2)
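For the exercise mentioned above, one possible hand-written combinations function is a short recursive generator. This is a sketch only; `my_combinations` is a hypothetical name, and it yields lists where `itertools.combinations` yields tuples.

```python
def my_combinations(items, k):
    """Yield every k-item combination of items, preserving input order."""
    if k == 0:
        yield []
        return
    for i, first in enumerate(items):
        # Pick items[i], then choose the remaining k-1 from what follows it.
        for rest in my_combinations(items[i + 1:], k - 1):
            yield [first] + rest

c1 = [10, 45, 50, 75, 90]
moves = [c for k in range(1, 5) for c in my_combinations(c1, k)]
print(len(moves))  # 30 ways to move 1 to 4 of the 5 items
```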
If you want to create every possible allocation of 10 items to 2 lists, then I would use combinations in the itertools package. For example:
import itertools

items = [10, 25, 45, 50, 15, 30, 65, 75, 80, 90]
for m in range(len(items)+1):
    combinations = list(itertools.combinations(items, m))
    for c1 in combinations:
        c1 = list(c1)
        # Everything not chosen for c1 goes to c2.
        c2 = list(set(items) - set(c1))
        print(c1, c2)
The following will move items from one list to another without the incorrect iterator position issue you were facing in the original problem:
c1 = [10, 45, 50, 75, 90]
c2 = [15, 25, 30, 65, 80]
while c1:
    c2.append(c1[0])
    del c1[0]
    print(c1, c2)
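The skipped items in the original loop come from removing elements from c1 while iterating over it: after 10 is removed, 45 shifts into position 0 and the loop moves on to position 1, which is now 50. Another way around this, sketched here as an alternative to the `while` loop, is to iterate over a shallow copy of the list.

```python
c1 = [10, 45, 50, 75, 90]
c2 = [15, 25, 30, 65, 80]

# Iterate over a shallow copy so removals from c1 do not shift the loop.
for item in c1[:]:
    c2.append(item)
    c1.remove(item)

print(c1, c2)
# [] [15, 25, 30, 65, 80, 10, 45, 50, 75, 90]
```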