Let's say I have the following dataframe:
Value
[None, A, B, C]
[None]
I would like to replace the None values in the column with the string none, but I couldn't figure out how.
I tried this, but it doesn't work:
df['Value'] = df['Value'].str.replace('None','none')
None is a built-in constant in Python, not a string, so str.replace won't match it; to get a lowercase none you have to replace it with the string 'none'.
There is no built-in way in Pandas to replace values in lists, but you can use explode to expand all the lists so that each individual item of each list gets its own row in the column, then replace, then group back together into the original list format:
df['Value'] = df['Value'].explode().replace({None: 'none'}).groupby(level=0).apply(list)
Output:
>>> df
Value
0 [none, A, B, C]
1 [none]
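A self-contained sketch of the explode/replace/regroup round-trip, using the column name from the question:

```python
import pandas as pd

# Rebuild the question's frame: each cell of "Value" holds a list
df = pd.DataFrame({"Value": [[None, "A", "B", "C"], [None]]})

# explode -> one row per list item, replace None, regroup by the original index
df["Value"] = df["Value"].explode().replace({None: "none"}).groupby(level=0).apply(list)
print(df["Value"].tolist())  # [['none', 'A', 'B', 'C'], ['none']]
```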
Here is a way using map():
df['Value'] = df['Value'].map(lambda x: ['none' if i == None else i for i in x])
Output:
Value
0 [none, A, B, C]
1 [none]
Related
I have a dataframe that has a column where each row has a list.
I want to get the next element after the value I am looking for (in another column).
For example:
Let's say I am looking for 'b':
|lists |next_element|
|---------|------------|
|[a,b,c,d]| c | #(c is the next value after b)
|[c,b,a,e]| a | #(a is the next value after b)
|[a,e,f,b]| [] | #(empty, because there is no next value after b)
*All lists contain the element; there are no lists without the value I am looking for.
Thank you
Try writing a function and using apply:
value = 'b'
def get_next(x):
    last_idx = len(x) - 1
    for curr_idx, i in enumerate(x):  # enumerate avoids index() finding an earlier duplicate
        if value.lower() == i.lower():
            if curr_idx == last_idx:
                return []
            return x[curr_idx + 1]
df["next_element"] = df["lists"].apply(get_next)
df
Out[649]:
lists next_element
0 [a, b, c, d] c
1 [c, b, a, e] a
2 [a, e, f, b] []
First observation: since you want the next element from a list of string elements, the expected data type for that column should be a string, not a list.
So, instead of the next_element column being [c, a, []], it is better to use [c, a, None].
Secondly, you should avoid apply directly over a Series and instead use the str methods that pandas provides for Series, which are a vectorized way of solving such problems quickly.
With the above in mind, let's try this completely vectorized one-liner -
element = 'b'
df['next_element'] = df.lists.str.join('').str.split(element).str[-1].str[0]
lists next_element
0 [a, b, c, d] c
1 [c, b, a, e] a
2 [a, e, f, b] NaN
First I combine each row into a single string: [a,b,c,d] -> 'abcd'
Next I split this by 'b' to get substrings
I pick the last element from this list and finally the first element from that, for each row, using str functions which are vectorized over each row.
Read more about pandas.Series.str methods in the official documentation/tutorial here
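For reference, here is the str-chain run end to end; note it assumes single-character list items, since the lists are joined into one string before splitting:

```python
import pandas as pd

df = pd.DataFrame({"lists": [list("abcd"), list("cbae"), list("aefb")]})
element = "b"

# join -> 'abcd'; split on 'b' -> ['a', 'cd']; last piece -> 'cd'; first char -> 'c'
df["next_element"] = df.lists.str.join("").str.split(element).str[-1].str[0]
print(df["next_element"].tolist())  # ['c', 'a', nan]
```

A row where 'b' is the last element yields an empty last piece, so `.str[0]` produces NaN there.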
df = df.assign(next_element="")
for ind in df.index:
    c = df["lists"][ind]
    for i, v in enumerate(c):
        if v == "b":
            # guard against 'b' being the last element
            df.loc[ind, "next_element"] = c[i + 1] if i + 1 < len(c) else ""
print(df)
Try this one and you will get the output you expected; the bounds check keeps a 'b' at the end of a list from raising an IndexError (that row is left as an empty string).
How can I join the multiple lists in a Pandas column 'B' and get the unique values only:
A B
0 10 [x50, y-1, sss00]
1 20 [x20, MN100, x50, sss00]
2 ...
Expected output:
[x50, y-1, sss00, x20, MN100]
You can do this simply with sum() (which concatenates the lists) and set():
result = list(set(df['B'].sum()))
Now if you print result you will get your desired output:
['y-1', 'x20', 'sss00', 'x50', 'MN100']
If the input data are not lists but strings, first create the lists:
df.B = df.B.str.strip('[]').str.split(',')
Or:
import ast
df.B = df.B.apply(ast.literal_eval)
Use Series.explode to get one Series from the lists, together with Series.unique to remove duplicates, if order is important:
L = df.B.explode().unique().tolist()
#alternative
#L = df.B.explode().drop_duplicates().tolist()
print (L)
['x50', 'y-1', 'sss00', 'x20', 'MN100']
Another idea, if order is not important: use a set comprehension to flatten the lists:
L = list({y for x in df.B for y in x})
print (L)
['x50', 'MN100', 'x20', 'sss00', 'y-1']
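A runnable sketch comparing the order-preserving and the set-based variants on the question's data:

```python
import pandas as pd

df = pd.DataFrame({"B": [["x50", "y-1", "sss00"], ["x20", "MN100", "x50", "sss00"]]})

# order-preserving: explode flattens the lists, unique keeps first-seen order
L = df.B.explode().unique().tolist()
print(L)  # ['x50', 'y-1', 'sss00', 'x20', 'MN100']

# order-free: flatten with a set comprehension
S = {y for x in df.B for y in x}
print(sorted(S))
```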
I am trying to create a column with the result of comparing a DataFrame cell list against a list.
I have this dataframe with list values:
df = pd.DataFrame({'A': [['KB4525236', 'KB4485447', 'KB4520724', 'KB3192137', 'KB4509091']], 'B': [['a', 'b']]})
and a list with this value:
findKBs = ['KB4525236','KB4525202']
The expected result :
A B C
0 [KB4525236, KB4485447, KB4520724, KB3192137, K... [a, b] [KB4525202]
I don't know how to compare my list against the cell list and find the non-matches. Can you help me?
You should simply compare the two lists: loop through the values of findKBs and collect them in a new list if they are not in df['A'][0]:
df['C'] = [[x for x in findKBs if x not in df['A'][0]]]
Result:
A B C
0 [KB4525236, KB4485447, KB4520724, KB3192137, K... [a, b] [KB4525202]
There's probably a more pandas-centric way to do it, but this works:
df['C'] = [list(filter(lambda el: el not in df['A'][0], findKBs))]
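Put together, a minimal reproduction of the comprehension approach on the question's data:

```python
import pandas as pd

df = pd.DataFrame({
    "A": [["KB4525236", "KB4485447", "KB4520724", "KB3192137", "KB4509091"]],
    "B": [["a", "b"]],
})
findKBs = ["KB4525236", "KB4525202"]

# keep every KB from findKBs that is missing from the list in row 0 of A
df["C"] = [[x for x in findKBs if x not in df["A"][0]]]
print(df["C"][0])  # ['KB4525202']
```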
In the following example, how do I keep only rows that have "a" in the array present in column tags?
df = pd.DataFrame(columns=["val", "tags"], data=[[5,["a","b","c"]]])
df[3<df.val] # this works
df["a" in df.tags] # is there an equivalent for filtering on tags?
I think using sets is intuitive. Then you can use >= as a set-containment test:
df[df.tags.apply(set) >= {'a'}]
val tags
0 5 [a, b, c]
A Numpy alternative would be
tags = df['tags']
n = len(tags)
out = np.zeros(n, dtype=bool)
i = np.arange(n).repeat(tags.str.len())
np.logical_or.at(out, i, np.concatenate(tags) == 'a')
df[out]
Per @JonClements, you can use set.issubset in a map (very clever):
df[df.tags.map({'a'}.issubset)]
val tags
0 5 [a, b, c]
Use a list comprehension:
df1 = df[["a" in x for x in df.tags]]
You could use apply with a lambda function that tests whether 'a' is in its argument:
df.tags.apply(lambda x: 'a' in x)
Result:
0 True
Name: tags, dtype: bool
This can also be used to index your dataframe:
df[df.tags.apply(lambda x: 'a' in x)]
Result:
val tags
0 5 [a, b, c]
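A minimal sketch with a second row added so the mask actually filters something:

```python
import pandas as pd

df = pd.DataFrame({"val": [5, 7], "tags": [["a", "b", "c"], ["x", "y"]]})

# boolean mask: True where the list in `tags` contains 'a'
mask = df.tags.apply(lambda x: "a" in x)
print(df[mask])               # keeps only the first row
print(df[mask].val.tolist())  # [5]
```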
I have the following list of combinations:
a = [(1,10),(2,8),(300,28),(413,212)]
b = [(8,28), (8,15),(10,21),(28,34),(413,12)]
I want to create a new combination list from these two lists, following this criterion:
The second element of a tuple in list a equals the first element of a tuple in list b.
Each such pair from list a and list b should form a new combination:
d = [(1,10,21),(2,8,28),(2,8,15),(300,28,34)]
All other tuples in both lists, which do not satisfy the criterion, are ignored.
QUESTIONS
Can I do this criteria based combination using itertools?
What is the most elegant way to solve this problem with/without using modules?
How can I write the output to an Excel sheet so that each element of a tuple in list d lands in a separate column, such that:
d = [(1,10,21),(2,8,28),(2,8,15),(300,28,34)] is displayed in excel as:
Col A = [1, 2, 2, 300]
Col B = [10,8,8,28]
Col C = [21,28,15,34]
pandas works like a charm for Excel.
Here is the code:
a = [(1,10),(2,8),(300,28),(413,212)]
b = [(8,28), (8,15),(10,21),(28,34),(413,12)]
c = [(x, y, t) for x, y in a for z, t in b if y == z]
import pandas as pd
df = pd.DataFrame(c)
df.to_excel('MyFile.xlsx', header=False, index=False)
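The question also asks whether itertools can do the criterion-based pairing; the same comprehension can be phrased with itertools.product (a sketch, equivalent to the nested loops above):

```python
from itertools import product

a = [(1, 10), (2, 8), (300, 28), (413, 212)]
b = [(8, 28), (8, 15), (10, 21), (28, 34), (413, 12)]

# product pairs every tuple in a with every tuple in b; keep pairs where
# a's second element equals b's first, and merge them into a triple
d = [(x, y, t) for (x, y), (z, t) in product(a, b) if y == z]
print(d)  # [(1, 10, 21), (2, 8, 28), (2, 8, 15), (300, 28, 34)]
```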