df.loc causes a SettingWithCopyWarning warning message

The following code causes a warning:
import numpy as np
import pandas as pd
s = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list('ABCD'))
s.loc[-1] = [5, np.nan, np.nan, 6]
grouped = s.groupby(['A'])
for key_m, group_m in grouped:
    group_m.loc[-1] = [10, np.nan, np.nan, 10]  # this assignment triggers the warning
C:\Anaconda3\lib\site-packages\ipykernel\__main__.py:10: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
According to the documentation, using .loc is the recommended way of doing this, so what is happening?
Thanks for your help.

The documentation is slightly confusing.
Your dataframe is a copy of another dataframe. You can verify this by running bool(df.is_copy). You are getting the warning because you are trying to assign to this copy.
The warning and the documentation are telling you how you should have constructed df in the first place, not how you should assign to it now that it is a copy.
df = some_other_df[cols]
will make df a copy of some_other_df. The warning suggests doing this instead:
df = some_other_df.loc[:, [cols]]
Now that the copy has been made, if you choose to ignore this warning, you could do
df = df.copy()
or
df.is_copy = None
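Applied to the loop in the question, the df.copy() option might look like the sketch below. It assumes the goal is to modify each group independently of s; the results dictionary is a hypothetical name used only to hold the modified groups. Since is_copy was deprecated in later pandas releases, the explicit copy is likely the more portable of the two options.

```python
import numpy as np
import pandas as pd

s = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list("ABCD"))

results = {}  # hypothetical container for the modified groups
for key_m, group_m in s.groupby("A"):
    # Take an explicit copy so the assignment clearly targets a new object
    group_m = group_m.copy()
    group_m.loc[-1] = [10, np.nan, np.nan, 10]
    results[key_m] = group_m
```

The original frame s is left unchanged; only the copies stored in results receive the extra row.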

Related

Error when try adding a column to a dataframe

I am trying first to slice some columns from the original dataframe and then add an additional column 'INDEX' as the last column.
df = df.iloc[:, np.r_[10:17]] #col 0~6
df['INDEX'] = df.index #col 7
The second line produces this message: 'A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead'
Why am I seeing this and how should I solve it?

I would do
df.loc[:,'INDEX'] = df.index

By default, pandas does not make an independent (deep) copy when you slice a DataFrame, so the result may still be tied to the original DataFrame. An assignment to the slice could therefore affect the original data or be silently lost, and the message points out exactly that.
Either of the following will make the Python interpreter happy 😃:
df = df.iloc[:, np.r_[10:17]].copy()
or
df.loc[:, ['INDEX']] = df.index
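A minimal, self-contained sketch of the .copy() variant follows. The 20-column frame named wide and its column labels are assumptions made up for illustration, since the original data is not shown.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the original dataframe: 3 rows, 20 columns
wide = pd.DataFrame(
    np.arange(60).reshape(3, 20),
    columns=[f"c{i}" for i in range(20)],
)

# Slice columns 10..16 and copy, so df is an independent DataFrame
df = wide.iloc[:, np.r_[10:17]].copy()

# Adding a column to the copy no longer raises the warning
df["INDEX"] = df.index
```

Because df is an explicit copy, the new column exists only in df and wide is untouched.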

Getting rid of Error from Fillna in pandas

I tried filling the NA values of a column in a dataframe with:
df1 = data.copy()
df1.columns = data.columns.str.lower()
df2 = df1[['passangerid', 'trip_cost','class']]
df2['class'] = df2['class'].fillna(0)
df2
However, I get this error:
:5: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation:
df2['class'] = df2['class'].fillna(0, axis = 0)
Can someone please help?

First of all I'd advise you to follow the warning message and read up on the caveats in the provided link.
You're getting this warning (not an error) because your df2 is a slice of your df1, not a separate DataFrame.
To avoid this warning, you can use the .copy() method:
df2 = df1[['passangerid', 'trip_cost','class']].copy()
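Put together with the code from the question, this could look like the sketch below. The contents of data are invented for illustration (the question does not show them); only the column names come from the original.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real `data` is not shown in the question
data = pd.DataFrame({
    "PassangerId": [1, 2, 3],
    "Trip_Cost": [10.0, 20.0, 30.0],
    "Class": [1.0, np.nan, 3.0],
})

df1 = data.copy()
df1.columns = data.columns.str.lower()

# .copy() makes df2 a separate DataFrame rather than a slice of df1
df2 = df1[["passangerid", "trip_cost", "class"]].copy()
df2["class"] = df2["class"].fillna(0)
```

The missing value is filled in df2 only; df1 still contains the NaN.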

Why do I get "a value is trying to be set on a copy of a slice" when creating a new column with apply?

I have something like this,
df1 = ...
df1['NEW_COLUMN'] = df1['SOME_COLUMN'].apply(lambda x: ...)
Although this works and the column 'NEW_COLUMN' is added to the dataframe, I get the following annoying warning. Why? And what is the solution?
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

If you simply want to avoid the warning, you can disable it in the pandas options. If you understand what the warning means and why it is raised, you can suppress it by adding this after importing pandas:
pd.options.mode.chained_assignment = None

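Add copy() to avoid getting this warning. Note that the warning depends on the DataFrame being assigned to, so the copy matters at the point where df1 is derived from another DataFrame; the trailing .copy() on the right-hand side below does not change which object receives the new column.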
df = pd.DataFrame({"Value" : [0.12,0.22,0.32,0.11,0.54,0.55,0.98]})
df['Category'] = df.Value.apply(lambda x: 'Neg' if x < 0.5 else 'Pos').copy()
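For comparison, here is a sketch in which the copy is taken at the point where the DataFrame is derived from another one, which is the situation that normally triggers the warning. The parent frame and the filter threshold of 0.2 are assumptions added for illustration.

```python
import pandas as pd

# Hypothetical parent frame; df1 in the question is assumed to be derived from one
parent = pd.DataFrame({"Value": [0.12, 0.22, 0.32, 0.11, 0.54, 0.55, 0.98]})

# Filtering returns a slice of `parent`; .copy() turns it into its own DataFrame
df1 = parent[parent["Value"] > 0.2].copy()

# Creating a column with apply on the copy does not warn
df1["Category"] = df1["Value"].apply(lambda x: "Neg" if x < 0.5 else "Pos")
```

The new column is added to df1 only, and parent keeps its single Value column.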

Efficient way to empty a pandas dataframe using its mutable alias property

I want to create a function in which part of the code modifies an existing pandas dataframe df and, under some conditions, empties it. The challenge is that this function is not allowed to return the dataframe itself; it can only modify df through the alias it receives. An example of this is the following function:
import pandas as pd
import random
def random_df_modifier(df):
    letter_lst = list('abc')
    message_lst = [f'random {i}' for i in range(len(letter_lst) - 1)] + ['BOOM']
    chosen_tup = random.choice(list(zip(letter_lst, message_lst)))
    df[chosen_tup[0]] = chosen_tup[1]
    if chosen_tup[0] == letter_lst[-1]:
        print('Game over')
        df = pd.DataFrame()  # <-- this line won't work as intended
    return chosen_tup
testing_df = pd.DataFrame({'col1': [True, False]})
print(random_df_modifier(testing_df))
I am aware that df = pd.DataFrame() won't work because the local name df is then bound to the new pd.DataFrame() instead of remaining an alias of the input dataframe. So is there any way to change df in place to an empty dataframe?
Thank you in advance
EDIT1: df.drop(df.index, inplace=True) seems to work as intended, but I am not sure about its efficiency, because df.drop() may suffer from performance issues when the dataframe is big enough (by big enough I mean 1 million+ total entries).

df = pd.DataFrame(columns=df.columns)
will empty a dataframe in pandas (and be much faster than using the drop method).
I believe that is what you are asking.
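Keep in mind that an assignment of this form inside a function only rebinds the local name, which is the problem described in the question. The sketch below contrasts it with the in-place drop from the question's edit; the function names rebind and empty_in_place are hypothetical, and no claim is made here about relative speed.

```python
import pandas as pd

def rebind(df):
    # Rebinds the local name only; the caller's DataFrame is unaffected
    df = pd.DataFrame(columns=df.columns)

def empty_in_place(df):
    # Mutates the object the caller passed in by dropping every row
    df.drop(df.index, inplace=True)

testing_df = pd.DataFrame({"col1": [True, False]})

rebind(testing_df)
rows_after_rebind = len(testing_df)  # caller still sees both rows

empty_in_place(testing_df)
rows_after_drop = len(testing_df)    # caller now sees an empty frame
```

The in-place version keeps the column labels and removes only the rows.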

pandas dataframe copy of slice warning

I'm fairly new to pandas, and was getting the infamous SettingWithCopyWarning in a large piece of code. I boiled it down to the following:
import pandas as pd
df = pd.DataFrame([[0,3],[3,3],[3,1],[1,1]], columns=list('AB'))
df
df = df.loc[(df.A>1) & (df.B>1)]
df['B'] = 10
When I run this I get the warning:
__main__:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
The strange thing is that if I leave off the "df" line it runs without a warning. Is this intended behavior?
In general, if I want to filter a DataFrame by the values across various columns, do I need to do a copy() to avoid the SettingWithCopyWarning?
thanks very much

Taking the filtered DataFrame from your question, shown below, the following avoids the SettingWithCopyWarning.
There is a GitHub discussion in which Jeff, one of the pandas developers, suggests this solution :)
df
A B
1 3 3
It is best to do it this way:
df['B'] = df['B'].replace(3, 10)
df
A B
1 3 10
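On the general question of filtering and then assigning, two common patterns are sketched below, using the data from the question. The names mask and filtered are introduced here for illustration. Which one fits depends on whether the original frame should be changed.

```python
import pandas as pd

df = pd.DataFrame([[0, 3], [3, 3], [3, 1], [1, 1]], columns=list("AB"))
mask = (df.A > 1) & (df.B > 1)

# Option 1: work on an independent copy of the filtered rows
filtered = df.loc[mask].copy()
filtered["B"] = 10

# Option 2: update the matching rows of the original in a single .loc call
df.loc[mask, "B"] = 10
```

Option 1 leaves df untouched until Option 2 runs; Option 2 selects rows and column in one indexing step, so there is no intermediate slice to assign to.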
