This question already has answers here:
Filter pandas DataFrame by substring criteria
(17 answers)
Closed 1 year ago.
df.loc[df['name'] == 'Mary']
The above gets the rows where 'name' is exactly 'Mary'. What if I want the rows where 'name' contains 'Mary', rather than being exactly equal to 'Mary'?
You can use the pd.Series.str.contains() method to achieve this.
df[df['name'].str.contains('Mary')]
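As a minimal sketch of the substring filter above, the example below builds a small, made-up DataFrame (the sample names are assumptions, not from the question). It passes na=False so that missing values count as non-matches, and regex=False so that 'Mary' is treated as a literal substring rather than a regular expression.

```python
import pandas as pd

# Hypothetical sample data, including a missing value and a lowercase match
df = pd.DataFrame({"name": ["Mary", "Mary Ann", "John", None, "rosemary"]})

# Literal, case-sensitive substring match; missing values count as "no match"
mask = df["name"].str.contains("Mary", na=False, regex=False)
matches = df[mask]

# Case-insensitive variant of the same filter
matches_ci = df[df["name"].str.contains("mary", case=False, na=False)]

print(matches)
print(matches_ci)
```

Without na=False, a missing value in the column yields a missing value in the mask, which typically raises an error when the mask is used for indexing.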
This question already has answers here:
How to filter Pandas dataframe using 'in' and 'not in' like in SQL
(11 answers)
Closed 3 years ago.
Hi, I am trying to remove rows where a column's value equals any one of several values. The example below shows how to drop rows matching a single value; I want to drop rows where the column value is "a1" or "b1".
My column header, 'Sky Product', contains a space, which is why I have used bracket indexing. Thanks.
df = df[df['Sky Product'] != 'a1']
I think you need:
df = df[~df["Sky Product"].isin(["a1","b1"])]
Try using:
df = df[(df['Sky Product'] != 'a1') & (df['Sky Product'] != 'b1')]
Or, if you have too many values to compare one by one, you can put them in a list and use isin:
r=['a1','b1',....]
df[~df['Sky Product'].isin(r)]
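A short sketch of both approaches on made-up data (the 'qty' column and the sample values are assumptions for illustration) shows that the negated isin mask and the chained != comparisons keep the same rows:

```python
import pandas as pd

# Hypothetical sample data; only the 'Sky Product' column comes from the question
df = pd.DataFrame({
    "Sky Product": ["a1", "b1", "c1", "a1", "d1"],
    "qty": [1, 2, 3, 4, 5],
})

unwanted = ["a1", "b1"]

# Keep rows whose 'Sky Product' is NOT in the unwanted list
kept = df[~df["Sky Product"].isin(unwanted)]

# Equivalent chained comparison, practical only for a handful of values
kept_chained = df[(df["Sky Product"] != "a1") & (df["Sky Product"] != "b1")]

print(kept)
```

The isin form scales better because the list of values can grow without changing the expression.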
This question already has answers here:
How to unnest (explode) a column in a pandas DataFrame, into multiple rows
(16 answers)
Closed 3 years ago.
This is my original dataframe. Every row has an email and a list of addresses (There is just street to exemplify).
email addresses
somename#gmail.com [{'street': 'a'}, {'street': 'b'}]
anothername#gmail.com [{'street': 'c'}]
And I expect this result:
email street
somename#gmail.com 'a'
somename#gmail.com 'b'
anothername#gmail.com 'c'
Is there a better way in pandas than iterating over the array to create this last dataframe?
You can use:
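import numpy as np
import pandas as pd
# Repeat each email once per address and flatten the lists of dicts
df1 = pd.DataFrame({'email': df.email.repeat(df.addresses.str.len()),
                    'addresses': np.concatenate(df.addresses.values)})
# Expand each {'street': ...} dict into a 'street' column
df1['street'] = df1.pop('addresses').apply(pd.Series)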
print(df1)
email street
0 somename#gmail.com a
0 somename#gmail.com b
1 anothername#gmail.com c
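As an alternative sketch, not the answer's method: pandas 0.25 and later provide DataFrame.explode, which turns each list element into its own row. The example below rebuilds the question's data and then pulls the 'street' value out of each dict.

```python
import pandas as pd

# Data as described in the question
df = pd.DataFrame({
    "email": ["somename#gmail.com", "anothername#gmail.com"],
    "addresses": [[{"street": "a"}, {"street": "b"}], [{"street": "c"}]],
})

# One row per address dict; the email is repeated automatically
exploded = df.explode("addresses").reset_index(drop=True)

# Replace the dict column with the value stored under its 'street' key
exploded["street"] = exploded.pop("addresses").map(lambda d: d["street"])

print(exploded)
```

The reset_index call is optional; without it the rows keep the original index (0, 0, 1), as in the output above.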
This question already has an answer here:
sort pandas dataframe based on list
(1 answer)
Closed 4 years ago.
I have a df and I want to reorder its rows based on a list, as shown, using Python:
df=pd.DataFrame({'Country':["AU","DE","UR","US","GB","SG","KR","JP","CN"],'Stage #': [3,2,6,6,3,2,5,1,1],'Amount':[4530,7668,5975,3568,2349,6776,3046,1111,4852]})
df
list=["US","CN","GB","AU","JP","KR","UR","DE","SG"]
How can I do that? Any thoughts? Thanks!
Use pd.Categorical:
list_ = ["US","CN","GB","AU","JP","KR","UR","DE","SG"]
df['Country'] = pd.Categorical(df.Country, categories = list_, ordered = True)
df.sort_values(by='Country')
Also, do not name your variable list, because that would shadow Python's built-in list type.
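Put together as a self-contained sketch using the question's data (the names order and sorted_df are introduced here for illustration), the ordered categorical makes sort_values follow the custom order instead of alphabetical order:

```python
import pandas as pd

df = pd.DataFrame({
    "Country": ["AU", "DE", "UR", "US", "GB", "SG", "KR", "JP", "CN"],
    "Stage #": [3, 2, 6, 6, 3, 2, 5, 1, 1],
    "Amount": [4530, 7668, 5975, 3568, 2349, 6776, 3046, 1111, 4852],
})

# Desired row order; named 'order' to avoid shadowing the built-in list
order = ["US", "CN", "GB", "AU", "JP", "KR", "UR", "DE", "SG"]

# An ordered categorical sorts by position in 'order'
df["Country"] = pd.Categorical(df["Country"], categories=order, ordered=True)

# sort_values returns a new DataFrame; df itself is left unsorted
sorted_df = df.sort_values(by="Country")

print(sorted_df)
```

Note that any country missing from the category list would become a missing value in the 'Country' column, so the list should cover every value present.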
This question already has answers here:
How to access pandas groupby dataframe by key
(6 answers)
Closed 9 years ago.
I have grouped the data in a dataframe with the following:
groups = df.groupby(['name'])
Now I can get the head of the groups with groups.head(2), which gives the first two rows of each group.
But how do I get a single group by its name? For example, if I want the group whose name is 'ruby', I can't just do groups['ruby'].
How about:
groups.get_group('ruby')
For more elaboration, see this related question
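A minimal sketch of get_group on made-up data (the 'score' column and the sample names are assumptions) is shown below. It groups by the scalar key 'name'; when the grouper is passed as a one-element list such as ['name'], recent pandas versions may expect the key as a one-element tuple, ('ruby',), instead.

```python
import pandas as pd

# Hypothetical sample data; only the 'name' column comes from the question
df = pd.DataFrame({
    "name": ["ruby", "pearl", "ruby", "jade", "pearl"],
    "score": [10, 20, 30, 40, 50],
})

groups = df.groupby("name")

# Pass the group's key (a value from the 'name' column), not the column name
ruby = groups.get_group("ruby")

print(ruby)
```

The result is an ordinary DataFrame containing only the rows of that group, with their original index labels.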