Getting rid of Error from Fillna in pandas - python

I tried filling the NA values of a column in a dataframe with:
df1 = data.copy()
df1.columns = data.columns.str.lower()
df2 = df1[['passangerid', 'trip_cost','class']]
df2['class'] = df2['class'].fillna(0)
df2
Although getting this error:
:5: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation:
df2['class'] = df2['class'].fillna(0, axis = 0)
Can someone please help?

First of all I'd advise you to follow the warning message and read up on the caveats in the provided link.
You're getting this warning (not an error) because your df2 is a slice of your df1, not a separate DataFrame.
To avoid getting this warning you can use .copy() method as:
df2 = df1[['passangerid', 'trip_cost','class']].copy()

Related

fillna and copy of a slice problem even after .loc

I am trying to fillna a column of dataframe like the following,
df['temp'] = df['temp'].fillna(method='ffill')
and I am getting,
var/folders/qp/lp_5yt3s65q_pj__6v_kdvnh0000gn/T/ipykernel_10842/2929940072.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df['temp'] = df['temp'].fillna(method='ffill')
I revised the code to the following but I am still getting the same error. Do you have any suggestions?
df.loc[:,'temp'] = df['temp'].fillna(method='ffill')

Error when try adding a column to a dataframe

I am trying first to slice a some columns from original dataframe and then add the additional column 'INDEX' to the last column.
df = df.iloc[:, np.r_[10:17]] #col 0~6
df['INDEX'] = df.index #col 7
I have the error message of second line saying 'A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead'
Why am I seeing this and how should I solve it?
I would do
df.loc[:,'INDEX'] = df.index
by default Python does shallow copy of dataframe. So whatever operations are performed on dataframe, it will actually performed on originall data frame. and the message is exactly indicates that.
Either of below will make the Python interpreter happy 😃 :
df = df.iloc[:, np.r_[10:17]].copy()
or
df.loc[:, ['INDEX']] = df.index

Why do I get "a value is trying to be set on a copy of a slice" when creating a new column with apply?

I have something like this,
df1 = ...
df1['NEW_COLUMN'] = df1['SOME_COLUMN'].apply(lambda x: ...)
Although this works and I get the column 'NEW_COLUMN' added to the dataframe, I get this following annying warning. Why? And what is the solution?
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
If you simply want to avoid getting warned, you can set it in pandas options. If you understand why the warning is, and why is it happening then you can simply ignore it by adding this after importing pandas:
pd.options.mode.chained_assignment = None
Add copy() to avoid getting this warning
df = pd.DataFrame({"Value" : [0.12,0.22,0.32,0.11,0.54,0.55,0.98]})
df['Category'] = df.Value.apply(lambda x: 'Neg' if x < 0.5 else 'Pos').copy()

how to remove this warning in python 3

trying to lower and strip a column in python 3 using panda, but getting the warning-- what is the right way so this warning will not come up
df["col1"] = df[["col1"]].apply(lambda x: x.str.strip())
df["col1"] = df[["col1"]].apply(lambda x: x.str.lower())
The warning
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self[k1] = value[k2]
how to remove the warning
To get rid of this warning apply it to a series instead of a dataframe. Using df[["col1"]] is creating a new dataframe that you are then setting to the column. If you instead just modify the column it'll be fine. Additionally, I chained the two together.
df["col1"] = df["col1"].str.strip().str.lower()

df.loc causes a SettingWithCopyWarning warning message

The following line of my code causes a warning :
import pandas as pd
s = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))
s.loc[-1] = [5,np.nan,np.nan,6]
grouped = s.groupby(['A'])
for key_m, group_m in grouped:
group_m.loc[-1] = [10,np.nan,np.nan,10]
C:\Anaconda3\lib\site-packages\ipykernel\__main__.py:10: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
According to the documentation this is the recommended way of doing, so what is happening ?
Thanks for your help.
The documentation is slightly confusing.
Your dataframe is a copy of another dataframe. You can verify this by running bool(df.is_copy) You are getting the warning because you are trying to assign to this copy.
The warning/documentation is telling you how you should have constructed df in the first place. Not how you should assign to it now that it is a copy.
df = some_other_df[cols]
will make df a copy of some_other_df. The warning suggests doing this instead
df = some_other_df.loc[:, [cols]]
Now that it is done, if you choose to ignore this warning, you could
df = df.copy()
or
df.is_copy = None

Categories