Getting SettingWithCopyWarning in spite of using .loc method on a dataframe [duplicate] - python

This question already has answers here:
Pandas still getting SettingWithCopyWarning even after using .loc
(3 answers)
Closed 6 years ago.
I'm trying to modify a single "cell" in a dataframe. Now, modification works, but I get this warning:
In [131]: df.loc[df['Access date'] == '06/01/2016 00:35:34', 'Title'] = 'XXXXXXXX'
ipython:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
Per Pandas: Replacing column values in dataframe I am using .loc method, yet I get this warning (I don't see a copy of dataframe that I'm supposedly modifying anywhere here)
Should this warning happen here? If not, how do I disable it?
UPDATE
It seems that df is a (weakref) copy of another dataframe (checked with .is_copy).

That link in the warning addresses the issue in detail under the section: Why does assignment fail when using chained indexing?
Summary of the section: pandas makes no guarantee about whether an indexing operation returns a view or a copy, so when df was itself derived from another dataframe the warning is raised, even for an assignment through .loc, to tell you that the change may not reach the dataframe you think you are modifying.
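To turn off warnings, you can use the warnings library and run the following code in one of your IPython notebook cells. Be aware that this silences every warning, not only SettingWithCopyWarning.
import warnings
# A bare warnings.catch_warnings() call has no effect; it only works as a
# context manager ("with warnings.catch_warnings():"), so it is dropped here.
warnings.simplefilter("ignore")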
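Rather than silencing the warning, a common alternative is to take an explicit copy at the point where df is derived from another dataframe, so that pandas no longer treats it as a slice. The following is a minimal sketch; the `source` frame and its sample values are made up for illustration and do not come from the question.

```python
import pandas as pd

# Hypothetical parent dataframe standing in for whatever df was derived from.
source = pd.DataFrame(
    {
        "Access date": ["06/01/2016 00:35:34", "07/01/2016 10:00:00"],
        "Title": ["first", "second"],
    }
)

# .copy() makes df an independent dataframe instead of a slice of source.
df = source[source["Title"].notna()].copy()

# The same .loc assignment as in the question, now on an independent object.
df.loc[df["Access date"] == "06/01/2016 00:35:34", "Title"] = "XXXXXXXX"
```

With the copy in place, the assignment changes df only and `source` is left untouched, which is usually what was intended.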


Pandas creating a column which counts the length of a previous column entries without getting a Set Copy Warning [duplicate]

This question already has answers here:
How to deal with SettingWithCopyWarning in Pandas
(20 answers)
Closed 1 year ago.
When I look at the answer to a similar question, shown in this link: Pandas: adding column with the length of other column as value
I run into an issue where the solution it suggests, i.e.
df['name_length'] = df['seller_name'].str.len()
throws the following warning:
'''
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
'''
My question is: how can this be done so that the warning does not occur? With this command I want to add a new column to the original dataframe, not create some sort of copy of a slice.
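I tested this on a sample data set in Python 3.8.
Sample data
Here is the same code that you ran:
df['name_length'] = df['seller_name'].str.len()
It ran without any warning. The warning only appears when df was itself created as a slice or filter of another dataframe, so check how df is built before this line.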
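To illustrate the point about how df is built, here is a small sketch in which df comes from filtering another dataframe. The `raw` frame and its values are invented for the example; the explicit `.copy()` is the usual way to tell pandas that df is meant to be independent.

```python
import pandas as pd

# Hypothetical unfiltered data; only the column name matches the question.
raw = pd.DataFrame(
    {"seller_name": ["Alice", "Bob", None], "price": [10, 20, 30]}
)

# Filtering returns a new object derived from raw; .copy() makes that explicit.
df = raw[raw["seller_name"].notna()].copy()

# Adding the column now targets df itself, not a slice of raw.
df["name_length"] = df["seller_name"].str.len()
```

If the `.copy()` is omitted on a pandas version that still emits SettingWithCopyWarning, the last line is the kind of statement that triggers it.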

How to assign a new column to a DataFrame without triggering the SettingWithCopyWarning [duplicate]

This question already has answers here:
A value is trying to be set on a copy of a slice from a DataFrame
(2 answers)
Closed 4 years ago.
This question has been asked before in similar examples; however, none of the answers I've seen address this particular problem in a satisfactory way (see later).
I have a DataFrame df and one of its columns, df['a'], contains NaN values. I remove the NaN elements, and then try to create a new column:
df = df[~df['a'].isnull()]
df['b'] = False
The above gives me a SettingWithCopyWarning:
/home/user/anaconda3/lib/python3.6/site-packages/pandas/core/indexing.py:517: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self.obj[item] = s
However, the warning message, and the other answers I've seen on Stack Overflow, don't seem to offer a satisfactory solution. The most common suggestion is df.loc[:,'b'] = False, but this still gives me the warning.
I also tried:
df['b'] = np.zeros(len(df), dtype=bool)
df.loc[:,'b'] = np.zeros(len(df), dtype=bool)
Yet all still get flagged with Warnings. So what is the correct way to do this, because clearly the warnings imply that I'm doing something wrong? Is there something with the coding practice above that should be avoided? One reason I do the above is in particular to create new columns and lock in their dtype (for example above, I don't want the column to be a float).
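You can try something like this:
df = df.assign(b=False)
assign returns a new DataFrame instead of setting a column on the existing (possibly sliced) one, so no warning is raised. See the pandas.DataFrame.assign documentation for more details.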
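As a rough sketch of how this fits the question's goal of fixing the dtype of the new column, the filter and the assign can be chained. The sample values and the name `filtered` are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a NaN in column 'a', as described in the question.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0]})

# Drop the rows where 'a' is NaN, then add 'b' on the resulting new frame.
filtered = df[df["a"].notna()].assign(b=False)
```

Because the scalar False is broadcast to every row, the new column comes out with a boolean dtype rather than float, and the original df is not modified.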

Does SettingWithCopyWarning even matter? [duplicate]

This question already has answers here:
How to deal with SettingWithCopyWarning in Pandas
(20 answers)
Closed 5 years ago.
I add values to a dataframe entry by entry as follows:
refined_cme_quandl_list['typical_daily_volume']= np.nan
for index, row in refined_cme_quandl_list.iterrows():
    refined_cme_quandl_list['typical_daily_volume'][index] = typical_volume[row['Quandl_download_symbol']]
I still get what I want, but I get this warning:
SettingWithCopyWarning: A value is trying to be set on a copy of a
slice from a DataFrame
See the caveats in the documentation:
http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
Does it matter?
Yes. The line above uses chained indexing (df['col'][index] = value) rather than a single indexing call, and assigning that way is not recommended because the value may be written to a temporary copy. Use df.loc instead:
refined_cme_quandl_list.loc[index, 'typical_daily_volume'] = \
    typical_volume[row['Quandl_download_symbol']]
It is quite possible that future releases of pandas will stop supporting chained assignment altogether, so you don't want your code breaking in the future.
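For this particular lookup, the loop may not be needed at all. The sketch below assumes that `typical_volume` is a plain dict keyed by symbol (the question does not say what it is) and uses made-up symbols and numbers; it shows both the .loc form from the answer and a vectorized alternative based on Series.map.

```python
import pandas as pd

# Hypothetical stand-ins for the objects named in the question.
refined_cme_quandl_list = pd.DataFrame(
    {"Quandl_download_symbol": ["CME/CL", "CME/GC"]}
)
typical_volume = {"CME/CL": 1000.0, "CME/GC": 500.0}

# Loop form from the answer: one .loc call per row, no chained indexing.
refined_cme_quandl_list["typical_daily_volume"] = float("nan")
for index, row in refined_cme_quandl_list.iterrows():
    refined_cme_quandl_list.loc[index, "typical_daily_volume"] = typical_volume[
        row["Quandl_download_symbol"]
    ]
looped = refined_cme_quandl_list["typical_daily_volume"].tolist()

# Vectorized form: map every symbol through the dict in one step.
refined_cme_quandl_list["typical_daily_volume"] = refined_cme_quandl_list[
    "Quandl_download_symbol"
].map(typical_volume)
mapped = refined_cme_quandl_list["typical_daily_volume"].tolist()
```

Both forms give the same column here; one difference is that map yields NaN for a symbol missing from the dict, whereas the loop raises a KeyError.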

Pandas Setting with copy warning

I am trying to slice a dataframe in pandas and save it to another dataframe. Let's say df is an existing dataframe and I want to slice a certain section of it into df1. However, I am getting this warning:
/usr/local/lib/python3.5/dist-packages/ipykernel_launcher.py:25: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
I have checked various posts on SO discussing similar issues (two of them follow):
Setting with copy warning
How to deal with SettingWithCopyWarning in Pandas?
Going through these links, I was able to find the issue in the following lines:
Years=['2010','2011']
for key in Years:
    df1['GG'][key]=df[key][df[key].Consumption_Category=='GG']
which I then changed to the following:
Years=['2010','2011']
for key in Years:
    df1['GG'][key]=df[key].loc[df['2010'].iloc[:,0]=='GG']
and got rid of the warning.
However, when I added another line to drop a certain column from this dataframe, I got the warning again, and I was unable to sort it out.
Years=['2010','2011']
for key in Years:
    df1['GG'][key]=df[key].loc[df['2010'].iloc[:,0]=='GG']
    df1['GG'][key]=df1['GG'][key].drop(['Consumption_Category'],axis=1,inplace=True)
Finally, after a lot of research and going through the pandas documentation, I found the answer to my question. The warning was caused by inplace=True in the drop() function. (With inplace=True, drop() also returns None, so the assignment above stored None rather than a dataframe.) So I removed inplace=True and saved the result into a new dataframe. Now I do not get any warning.
Years=['2010','2011']
for key in Years:
    df1['GG'][key]=df[key].loc[df['2010'].iloc[:,0]=='GG']
    df1['GG'][key]=df1['GG'][key].drop(['Consumption_Category'],axis=1)
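The same idea can be written as a self-contained sketch. It assumes, as the code above suggests, that df is a dict of per-year dataframes and df1 is a nested dict; the column `value` and all sample rows are invented. It also filters each year by its own Consumption_Category column, which may or may not match the original data layout.

```python
import pandas as pd

# Hypothetical per-year dataframes keyed by year string.
df = {
    "2010": pd.DataFrame({"Consumption_Category": ["GG", "HH"], "value": [1, 2]}),
    "2011": pd.DataFrame({"Consumption_Category": ["HH", "GG"], "value": [3, 4]}),
}
df1 = {"GG": {}}

years = ["2010", "2011"]
for key in years:
    yearly = df[key]
    # Select the GG rows and take an explicit copy so later edits are independent.
    subset = yearly.loc[yearly["Consumption_Category"] == "GG"].copy()
    # drop() without inplace returns a new dataframe, which is what gets stored.
    df1["GG"][key] = subset.drop(columns=["Consumption_Category"])
```

The source dataframes keep their Consumption_Category column, and each stored result is a separate object that can be modified without a warning.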

Pandas SettingWithCopyWarning when using iloc

I'm trying to change values in my DataFrame after merging it with another DataFrame and coming across some issues (doesn't appear to be an issue prior to merging).
I am indexing and changing values in my DataFrame with:
df.iloc[0]['column'] = 1
Subsequently I've joined (left outer join) along both indexes using merge (I realize left.join(right) would work too). After this when I perform the same value assignment using iloc, I receive the following warning:
__main__:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
A review of the linked documentation didn't clarify things for me. Am I using an incorrect method of slicing with iloc? (Keep in mind that I require position-based slicing for the purpose of my code.)
I notice that df.ix[0,'column'] = 1 works, and similarly based on this page I can reference the column location with df.columns.get_loc('column') but on the surface this seems unnecessarily convoluted.
What's the difference between these methods under the hood, and what about merging causes the previous method (df.iloc[0]['column']) to break?
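You are using chained indexing above: df.iloc[0]['column'] = 1. This is to be avoided, and it generates the SettingWithCopyWarning you are getting. The pandas docs are a bit complicated, but see the section on SettingWithCopyWarning with chained indexing for the under-the-hood explanation of why this does not work.
Instead you should use df.loc[0, 'column'] = 1, which does the whole assignment in a single indexing call. Note that .loc selects the row by label, so this matches the first row only if its index label is 0.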
.loc is for "Access a group of rows and columns by label(s) or a boolean array."
.iloc is for "Purely integer-location based indexing for selection by position."
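The difference between the two accessors is easiest to see when the index labels are not in positional order. The frame below is a made-up example, and `by_label` and `by_position` are illustrative names.

```python
import pandas as pd

# Index labels deliberately out of order: the row labelled 0 is the second row.
df = pd.DataFrame({"column": [10, 20, 30]}, index=[2, 0, 1])

by_label = df.loc[0, "column"]   # row whose index label is 0
by_position = df.iloc[0, 0]      # first row, first column, regardless of labels
```

Here `by_label` is 20 and `by_position` is 10, so swapping .iloc for .loc is only safe when labels and positions coincide.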
It sucks, but the best solution I've come up with so far for updating a dataframe's column by position is to find the integer location of the column, then use .iloc for everything:
column_i_loc = np.where(df.columns == 'column')[0][0]
df.iloc[0, column_i_loc] = 1
Note that you could also disable the warning, but really, do not!
Also, if you see this warning and were not trying to update some original DataFrame, then you forgot to make a copy, and you may end up with a nasty bug.
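For the merge scenario in the question, the positional approach can be sketched as follows, using df.columns.get_loc (mentioned in the question) in place of np.where. The `left` and `right` frames are invented for the example. One likely reason the chained form stops working after the merge is that the merged frame holds columns of more than one dtype, in which case df.iloc[0] tends to return a copy of the row rather than a view; treat that explanation as an assumption, since it depends on pandas internals and version.

```python
import pandas as pd

# Hypothetical frames joined on their indexes, as in the question.
left = pd.DataFrame({"column": [0, 0]}, index=["a", "b"])
right = pd.DataFrame({"other": ["x", "y"]}, index=["a", "b"])
df = left.merge(right, left_index=True, right_index=True, how="left")

# Translate the column name to its integer position once...
col_pos = df.columns.get_loc("column")

# ...then assign with a single, purely positional .iloc call.
df.iloc[0, col_pos] = 1
```

Because the row and the column are resolved in one indexing call, there is no intermediate object for the value to get lost in.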
