Setting values on a copy of a slice from a DataFrame [duplicate] - python

This question already has answers here:
How to deal with SettingWithCopyWarning in Pandas
(20 answers)
Closed 4 years ago.
I have a small dataframe, say this one :
      Mass32    Mass44
12  0.576703  0.496159
13  0.576658  0.495832
14  0.576703  0.495398
15  0.576587  0.494786
16  0.576616  0.494473
...
I would like to have a rolling mean of column Mass32, so I do this:
x['Mass32s'] = pandas.rolling_mean(x.Mass32, 5).shift(-2)
It works, in that I get a new column named Mass32s containing what I expect, but I also get this warning message:
A value is trying to be set on a copy of a slice from a DataFrame. Try
using .loc[row_indexer,col_indexer] = value instead
See the the caveats in the documentation:
http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
I'm wondering if there's a better way to do it, notably to avoid getting this warning message.

This warning appears because your dataframe x is a copy of a slice of another DataFrame. It is not always easy to tell why; it depends on how x was created earlier in your code.
You can create an independent dataframe out of x by doing
x = x.copy()
This will remove the warning, but it is not the proper way.
You should use the DataFrame.loc indexer, as the warning suggests, like this:
x.loc[:,'Mass32s'] = pandas.rolling_mean(x.Mass32, 5).shift(-2)
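Note that pandas.rolling_mean was deprecated and later removed from pandas, so the lines above only run on old releases. A minimal sketch of the same calculation with the Series.rolling API follows, using the five rows shown in the question. The name parent and the filter used to derive x are hypothetical, since the question does not show how x was created.

```python
import numpy as np
import pandas as pd

# The five rows shown in the question. "parent" is a hypothetical name for
# whatever DataFrame x was originally sliced from.
parent = pd.DataFrame(
    {
        "Mass32": [0.576703, 0.576658, 0.576703, 0.576587, 0.576616],
        "Mass44": [0.496159, 0.495832, 0.495398, 0.494786, 0.494473],
    },
    index=[12, 13, 14, 15, 16],
)

# Hypothetical filter; .copy() makes x an independent DataFrame right away.
x = parent[parent["Mass32"] > 0.5].copy()

# 5-point rolling mean, shifted back two rows so the window is centered.
x["Mass32s"] = x["Mass32"].rolling(5).mean().shift(-2)

# The same result expressed directly as a centered window.
centered = x["Mass32"].rolling(5, center=True).mean()
```

Because x is an explicit copy, the new column lands only in x and parent is left untouched.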

Related

SettingWithCopyWarning: Cannot solve [duplicate]

This question already has answers here:
How to deal with SettingWithCopyWarning in Pandas
(20 answers)
Closed 2 years ago.
I have read some threads but I can not solve the problem.
I am using this code:
data_new=dataset_circulos[['x_tipif','y_tipif']]
A=cluster.KMeans(n_clusters=2).fit(data_new[['x_tipif','y_tipif']])
predicciones=A.predict(pd.DataFrame(data_new[['x_tipif','y_tipif']]))
data_new.loc[ : , 'predicciones'] = predicciones
centroides=A.cluster_centers_
sns.pairplot(x_vars='x_tipif', y_vars='y_tipif', data=data_new, hue="predicciones")
The application shows me the message below:
C:\Users\USER-PC\Anaconda3\lib\site-packages\pandas\core\indexing.py:376: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
How can I fix it?
To solve this warning, the line
data_new=dataset_circulos[['x_tipif','y_tipif']]
should instead be written as:
data_new = dataset_circulos[['x_tipif','y_tipif']].copy()
to make an explicit copy of the slice of dataset_circulos.
Given that data_new only contains two columns, 'x_tipif' and 'y_tipif', your later indexing in
A=cluster.KMeans(n_clusters=2).fit(data_new[['x_tipif','y_tipif']])
is redundant. This could be more simply written as
A=cluster.KMeans(n_clusters=2).fit(data_new)
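As a rough illustration of the explicit-copy fix, the sketch below leaves out scikit-learn and uses placeholder labels in place of A.predict(data_new). The sample values and the extra column otra are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical stand-in for dataset_circulos; the values are illustrative only.
dataset_circulos = pd.DataFrame(
    {
        "x_tipif": [0.1, 0.2, 0.9, 1.0],
        "y_tipif": [0.1, 0.3, 0.8, 1.1],
        "otra": ["a", "b", "c", "d"],
    }
)

# Explicit copy: data_new is now an independent DataFrame.
data_new = dataset_circulos[["x_tipif", "y_tipif"]].copy()

# Placeholder labels standing in for the output of A.predict(data_new).
predicciones = [0, 0, 1, 1]

# Adding a column to the copy does not touch dataset_circulos.
data_new["predicciones"] = predicciones
```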

What does SettingWithCopyWarning imply [duplicate]

This question already has answers here:
How to deal with SettingWithCopyWarning in Pandas
(20 answers)
Closed 3 years ago.
I tried renaming an index value with the code:
df_1.rename({'*****': "Favour Edwards"}, axis = 0, inplace = True)
and I got this message:
/home/jupyterlab/conda/envs/python/lib/python3.6/site-packages/pandas/core/frame.py:4238: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
return super().rename(**kwargs)
Although my code still ran without any observable errors, I checked out the link, but I'm still pretty new to coding and the vocabulary of the documentation was too complex for me to really understand. Can anyone break down the meaning in simpler terms?
It's a WARNING, not an ERROR (a big distinction): it tells you that your operation may not have worked as expected and that you should check the results.
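In simpler terms, the warning is about which object actually received the change. The sketch below assumes df_1 was produced by filtering a larger DataFrame, which the question does not show; the name df and the sample data are hypothetical.

```python
import warnings

import pandas as pd

# Hypothetical data; "*****" mimics the placeholder index label from the question.
df = pd.DataFrame({"score": [1, 5, 7]}, index=["Ann", "*****", "Bob"])

# Boolean filtering returns a new DataFrame holding copied rows.
df_1 = df[df["score"] > 2]

with warnings.catch_warnings():
    # Silence warnings for this demonstration only.
    warnings.simplefilter("ignore")
    df_1.rename({"*****": "Favour Edwards"}, axis=0, inplace=True)

# The rename changed df_1 but not the DataFrame it was filtered from.
renamed_labels = list(df_1.index)
original_labels = list(df.index)
```

If changing only df_1 is what you wanted, the result is fine and the warning can be avoided by creating df_1 with .copy().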

Pandas map to a new column, SettingWithCopyWarning [duplicate]

This question already has an answer here:
df.loc causes a SettingWithCopyWarning warning message
(1 answer)
Closed 6 years ago.
In a pandas DataFrame, I'm trying to apply a user-defined function f to each value of df['old_column'] and store the result in a new column.
df['new_column'] = df['old_column'].map(lambda x: f(x))
This gives the warning "SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame."
I tried the following:
df.loc[:, 'new_column'] = df['old_column'].map(lambda x: f(x))
which doesn't help. What can I do?
A SettingWithCopy warning is raised for certain operations in pandas which may not have the expected result because they may be acting on copies rather than the original datasets. Unfortunately there is no easy way for pandas itself to tell whether or not a particular call will or won't do this, so this warning tends to be raised in many, many cases where (from my perspective as a user) nothing is actually amiss.
Both of your method calls are fine. If you want to get rid of the warning entirely, you can specify:
pd.options.mode.chained_assignment = None
See this StackOverflow Q&A for more information on this.
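One alternative that avoids assigning into a possibly-derived DataFrame is DataFrame.assign, which returns a new DataFrame with the extra column. This is only a sketch: the function f below is a hypothetical placeholder for the user-defined function in the question.

```python
import pandas as pd


def f(value):
    # Hypothetical user-defined function; here it simply doubles its input.
    return value * 2


df = pd.DataFrame({"old_column": [1, 2, 3]})

# assign() builds and returns a new DataFrame instead of modifying df in place.
# Passing f directly is equivalent to map(lambda x: f(x)).
df = df.assign(new_column=df["old_column"].map(f))
```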

Getting SettingWithCopyWarning warning even after using .loc in pandas [duplicate]

This question already has answers here:
Pandas still getting SettingWithCopyWarning even after using .loc
(3 answers)
Closed 6 years ago.
df_masked.loc[:, col] = df_masked.groupby([df_masked.index.month, df_masked.index.day])[col].\
transform(lambda y: y.fillna(y.median()))
Even after using .loc, I get the following warning. How do I fix it?
Anaconda\lib\site-packages\pandas\core\indexing.py:476: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self.obj[item] = s
You can get this SettingWithCopyWarning if df_masked is a sub-DataFrame of some other DataFrame.
In particular, if data was copied from the original DataFrame to df_masked, pandas emits the warning to alert you that modifying df_masked will not affect the original DataFrame.
If you do not intend to modify the original DataFrame, then you are free to ignore the warning.
There are ways to shut off the warning on a case-by-case basis. In particular, in older pandas versions you could use df_masked.is_copy = False (the is_copy attribute was later deprecated and removed).
If you run into this warning a lot, then instead of silencing it case by case, I think it is better to leave it enabled while you are developing your code. Be aware of what the warning means, and if the modifying-the-child-does-not-affect-the-parent issue does not affect you, then ignore it. When your code is ready for production, or if you are experienced enough to not need the warnings, shut them off entirely with
pd.options.mode.chained_assignment = None
near the top of your code.
Here is a simple example which demonstrates the problem and (a) solution:
import pandas as pd
df = pd.DataFrame({'swallow':['African','European'], 'cheese':['gouda', 'cheddar']})
df_masked = df.iloc[1:]
df_masked.is_copy = False # comment out this line to see the warning (is_copy exists only in older pandas versions)
df_masked.loc[:, 'swallow'] = 'forest'
The warning exists to alert new users to the fact that chained indexing such as
df.iloc[1:].loc[:, 'swallow'] = 'forest'
will not affect df when the first indexer (e.g. df.iloc[1:]) returns a copy.
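If the goal is to modify df itself, the chained expression can be collapsed into a single .loc call on df, so there is no intermediate object that might be a copy. A minimal sketch using the same example frame:

```python
import pandas as pd

df = pd.DataFrame({'swallow': ['African', 'European'], 'cheese': ['gouda', 'cheddar']})

# One .loc call on df itself: rows from position 1 onward, column 'swallow'.
df.loc[df.index[1:], 'swallow'] = 'forest'
```

Here df.index[1:] turns the positional slice into row labels, so the row and column selection happen in one indexing step.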
