This question already has answers here:
How to replace NaN values by Zeroes in a column of a Pandas Dataframe?
(17 answers)
Closed 1 year ago.
For example, I want to replace NaN with 'dog' and 'cat': in rows 1-30, NaN should be replaced with 'dog', and in rows 40-100 it should be replaced with 'cat'. How am I supposed to do that?
Split your problem into smaller ones:
How to select the data? 1-30, 40-100
dataframe.iloc[0:30]
dataframe.iloc[40:100]
How to replace NaN?
dataframe.fillna('dog')
You can use fillna on a subset of your dataset (df). Note that calling fillna(..., inplace=True) on a .loc slice operates on a copy and does not modify df, so assign the result back instead:
df.loc[1:30] = df.loc[1:30].fillna('dog')
df.loc[40:100] = df.loc[40:100].fillna('cat')
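Putting the pieces together, here is a minimal runnable sketch; the column name 'val' and the placeholder value 'bird' are hypothetical, and .loc slicing is label-based and inclusive of both endpoints:

import numpy as np
import pandas as pd

# Hypothetical data: an object column mixing strings and NaN
df = pd.DataFrame({'val': [np.nan if i % 5 == 0 else 'bird' for i in range(101)]})

# Fill NaN differently per row range; assign back rather than using inplace=True
df.loc[1:30] = df.loc[1:30].fillna('dog')
df.loc[40:100] = df.loc[40:100].fillna('cat')

print(df.loc[0:5, 'val'].tolist())  # [nan, 'bird', 'bird', 'bird', 'bird', 'dog']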
This question already has answers here:
Python Pandas Counting the Occurrences of a Specific value
(8 answers)
Closed 2 months ago.
I have a pandas dataframe with a column that is populated by "yes" or "no" strings.
When I call .value_counts() on this column, I get the correct distribution.
But when I run .isna(), it looks as if the whole column is NaN.
I suspect this will create problems for me later.
Example:
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[0,1,2,3,4],[40,30,20,10,0], ['yes','yes','no','no','yes']]).T, columns=['A','B','C'])
len(df['C'].isna())     # 5 --> why?!
df['C'].value_counts()  # yes: 3, no: 2 --> as expected
len gives you the length of the Series (irrespective of its content), not the number of True values.
Use sum if you want the count of True:
df['C'].isna().sum()
# 0
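A minimal sketch of the difference, with one real NaN added so the two numbers diverge (hypothetical data):

import numpy as np
import pandas as pd

s = pd.Series(['yes', 'no', np.nan, 'yes'])

s.isna()               # boolean Series: [False, False, True, False]
print(len(s.isna()))   # 4 -- length of the Series, True and False alike
print(s.isna().sum())  # 1 -- number of True values, i.e. the NaN count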
This question already has answers here:
Replacing few values in a pandas dataframe column with another value
(8 answers)
Closed 4 months ago.
I have a dataframe with multiple columns. I know how to change a value based on a condition for one specific column. But how can I change values based on a condition across all columns of the whole dataframe? I want to replace // with 1.
col1;col2;col3;col4;
23;54;12;//;
54;//;2;//;
8;2;//;1;
Let's try
df = df.replace('//', 1)
# or
df = df.mask(df.eq('//'), 1)
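A self-contained sketch reconstructing the sample above (read from an inline string; the trailing semicolons from the question are dropped for simplicity):

import io
import pandas as pd

csv = """col1;col2;col3;col4
23;54;12;//
54;//;2;//
8;2;//;1"""
df = pd.read_csv(io.StringIO(csv), sep=';')

# Columns containing '//' are read as strings, so replace the literal value
df = df.replace('//', 1)
print(df)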
This question already has answers here:
How to drop rows of Pandas DataFrame whose value in a certain column is NaN
(15 answers)
Closed 2 years ago.
My DataFrame has two columns, both of which have NaN values. I need to delete the rows with NaN only in the user_email column.
However, I used df['user_email'] = df['user_email'].dropna(), and it returned the exact same DataFrame, with all the NaN values in the second column intact.
How can I delete the rows with NaN on the second column?
Assigning df['user_email'].dropna() back to the column doesn't work because pandas realigns the shorter result on the original index, restoring the NaN values. Drop the rows instead, with inplace=True if you want to modify df directly:
df.dropna(subset=['user_email'], inplace=True)
You can use the subset keyword argument.
df = df.dropna(subset=['user_email'])
You could use boolean indexing. This allows you to select rows based on a conditional statement (e.g., df.user_email.notna())
df = df[df.user_email.notna()]
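A minimal sketch with a hypothetical two-column frame (the column names other than user_email are made up):

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'user_email': ['a@x.com', np.nan, 'c@x.com'],
    'phone':      [np.nan, '555-0100', np.nan],
})

# Drop rows where user_email is NaN; NaNs in other columns are kept
df = df.dropna(subset=['user_email'])
print(df)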
This question already has answers here:
index of non "NaN" values in Pandas
(3 answers)
Closed 2 years ago.
I have a dataframe called CentroidXY and I want to find the indexes of the rows in the column 'X' that correspond to numeric values (not NaN). I tried:
foo = CentroidXY.index[CentroidXY['X'] == int].tolist()
However, this returns no indexes, even though my column contains numeric values. Does anyone have any idea how to do this?
You could use:
CentroidXY.index[CentroidXY['X'].notna()]
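A minimal sketch with hypothetical values in 'X'. Note that CentroidXY['X'] == int compares each value to the type int itself, which is never equal, hence the empty result:

import numpy as np
import pandas as pd

CentroidXY = pd.DataFrame({'X': [1.5, np.nan, 3.2, np.nan, 7.0]})

# Index labels of the rows where 'X' is not NaN
foo = CentroidXY.index[CentroidXY['X'].notna()].tolist()
print(foo)  # [0, 2, 4]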
This question already has answers here:
How to drop rows of Pandas DataFrame whose value in a certain column is NaN
(15 answers)
Closed 4 years ago.
I have a large dataframe with a column populated by NaN and integers.
I identified the rows that are NOT empty (i.e. that return True for notnull()):
df.loc[df.score.notnull()]
How do I remove these rows and keep the rows with missing values?
This code doesn't work:
df.drop(df.score.notnull())
Assuming you want the result in the same dataframe, you could use:
df = df[df.score.isnull()]
You could use df.loc[df.score.isnull()] or df.loc[~df.score.notnull()].
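A minimal sketch with a hypothetical score column:

import numpy as np
import pandas as pd

df = pd.DataFrame({'score': [1, np.nan, 3, np.nan]})

# Keep only the rows where score is missing
df = df[df.score.isnull()]
print(df.index.tolist())  # [1, 3]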