I am trying to slice some columns from the original dataframe and then add an additional column 'INDEX' as the last column.
df = df.iloc[:, np.r_[10:17]] #col 0~6
df['INDEX'] = df.index #col 7
The second line produces this message: 'A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead'
Why am I seeing this and how should I solve it?
I would do
df.loc[:,'INDEX'] = df.index
Indexing a DataFrame may return either a view of the original data or a copy, and pandas cannot always tell which one you have. The warning is telling you that the assignment may be landing on a temporary copy instead of the DataFrame you meant to change.
Either of the following avoids the warning 😃:
df = df.iloc[:, np.r_[10:17]].copy()
or
df.loc[:, ['INDEX']] = df.index
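A minimal sketch of the `.copy()` route, using a made-up 20-column frame (the name `original`, the column names and the sizes are assumptions for illustration, not taken from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the original frame: 5 rows, 20 columns named c0..c19.
original = pd.DataFrame(
    np.arange(100).reshape(5, 20),
    columns=[f"c{i}" for i in range(20)],
)

# Take columns 10..16 by position and make the result an independent frame.
df = original.iloc[:, 10:17].copy()

# The new column is now added to df itself, not to a slice of `original`.
df["INDEX"] = df.index
```

A plain slice `10:17` selects the same columns as `np.r_[10:17]` here; `np.r_` only becomes necessary when combining several ranges.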
Related
I am using pandas to make a dataframe. I want to delete the first 12 rows with the drop function. Every resource I find says to use drop to delete rows, but it doesn't work and I don't know why. The error says that 'list' object has no attribute 'drop'. Could you help me find out what I should do?
url=Exp01.html
url=str(url)
df = pd.read_html(url)
df = df.drop(index=['1','12'],axis=0,inplace=True)
print(df)
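You can slice the rows out:
df = df.loc[12:]
df
In general, a loc slice looks like this:
df.loc[x:y]
where x is the starting index label and y is the ending index label; both ends are included.
[12:] starts at label 12 and has no ending label, so with the default integer index it skips the first 12 rows (labels 0 to 11).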
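To make the slicing rules concrete, here is a small sketch on a made-up 15-row frame with the default integer index (the data is an assumption). With that index the first 12 rows carry labels 0 to 11, so starting at 12 is what removes all of them:

```python
import pandas as pd

# Hypothetical frame with the default integer index 0..14.
df = pd.DataFrame({"value": range(15)})

by_label = df.loc[12:]      # label-based: keeps rows whose label is 12 or later
by_position = df.iloc[12:]  # position-based: skips the first 12 rows

# With the default index both selections contain the same rows.
same = by_label.equals(by_position)

# Unlike ordinary Python slices, .loc includes the end label.
inclusive = df.loc[2:4]     # rows labelled 2, 3 and 4
```

If the index is not the default 0, 1, 2, ... the two can differ, and `iloc` is the safer choice for "the first N rows".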
Pandas read_html returns a list of dataframes.
So df is a list in your example, which is why it has no drop method. First, take a look at what the list holds.
If it's just one table (dataframe), you can change it to:
df = pd.read_html(url)[0]
Full code:
import pandas as pd
url = 'Exp01.html'
df = pd.read_html(url)[0]  # read_html returns a list; take the first table
df.drop(index=df.index[:12], inplace=True)  # drop the first 12 rows in place
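Since the HTML file is not available here, the sketch below uses a hand-built list of DataFrames as a stand-in for what `pd.read_html` returns (the name `tables` and the table contents are made up). It shows the same two steps: pick a table out of the list, then drop the first 12 rows.

```python
import pandas as pd

# Hypothetical stand-in for `pd.read_html(url)`: a list holding one table.
tables = [pd.DataFrame({"reading": range(20)})]

print(type(tables), len(tables))  # inspect the list before indexing into it

df = tables[0]                      # pick the first (here: only) table
df = df.drop(index=df.index[:12])   # drop the first 12 rows by their labels
df = df.reset_index(drop=True)      # optional: renumber the remaining rows from 0
```

Reassigning the result of `drop` avoids `inplace=True`, which returns `None` and is easy to misuse as `df = df.drop(..., inplace=True)`.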
I tried filling the NA values of a column in a dataframe with:
df1 = data.copy()
df1.columns = data.columns.str.lower()
df2 = df1[['passangerid', 'trip_cost','class']]
df2['class'] = df2['class'].fillna(0)
df2
However, I am getting this error:
:5: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation:
df2['class'] = df2['class'].fillna(0, axis = 0)
Can someone please help?
First of all I'd advise you to follow the warning message and read up on the caveats in the provided link.
You're getting this warning (not an error) because your df2 is a slice of your df1, not a separate DataFrame.
To avoid the warning, use the .copy() method:
df2 = df1[['passangerid', 'trip_cost','class']].copy()
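A short sketch of that fix end to end, on made-up data (the values and the extra `other` column are assumptions; the other column names follow the question):

```python
import numpy as np
import pandas as pd

# Hypothetical data; column names follow the question after lower-casing.
df1 = pd.DataFrame({
    "passangerid": [1, 2, 3],
    "trip_cost": [10.0, 20.0, 30.0],
    "class": [1.0, np.nan, 3.0],
    "other": ["a", "b", "c"],
})

# .copy() makes df2 independent of df1, so the assignment below is unambiguous.
df2 = df1[["passangerid", "trip_cost", "class"]].copy()
df2["class"] = df2["class"].fillna(0)
```

Only `df2` is filled; the missing value in `df1` is left as it was.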
I have a dataframe where three of the columns are coordinates of data ('H_x', 'H_y' and 'H_z'). I want to calculate the radius vector of the data and add it as a new column in my dataframe. But I have some kind of problem with the pandas apply function.
My code is:
def radvec(x, y, z):
    rv = np.sqrt(x**2 + y**2 + z**2)
    return rv
halo_field['rh_field']=halo_field.apply(lambda row: radvec(row['H_x'], row['H_y'], row['H_z']), axis=1)
The error I'm getting is:
group_sh.py:78: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
halo_field['rh_field']=halo_field.apply(lambda row: radvec(row['H_x'], row['H_y'], row['H_z']), axis=1)
I get the column that I want, but I'm still confused by this error message.
I'm aware there are similar questions here, but I couldn't find how to solve my problem. I'm fairly new to python. Can you help?
Edit: halo_field is a slice of another dataframe:
halo_field = halo_res[halo_res.N_subs==1]
The problem is you're working with a slice, which can be ambiguous:
halo_field = halo_res[halo_res.N_subs==1]
You have two options:
Work on a copy
You can explicitly copy your dataframe to avoid the warning and ensure your original dataframe is unaffected:
halo_field = halo_res[halo_res.N_subs==1].copy()
halo_field['rh_field'] = halo_field.apply(...)
Work on the original dataframe conditionally
Use pd.DataFrame.loc with a Boolean mask to update your original dataframe:
mask = halo_res['N_subs'] == 1
halo_res.loc[mask, 'rh_field'] = halo_res.loc[mask].apply(...)
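As a rough sketch of the masked update on made-up data (the values are assumptions; the column names come from the question), with the radius computed directly from the selected rows:

```python
import numpy as np
import pandas as pd

# Hypothetical data with the column names from the question.
halo_res = pd.DataFrame({
    "N_subs": [1, 2, 1],
    "H_x": [3.0, 1.0, 0.0],
    "H_y": [4.0, 1.0, 0.0],
    "H_z": [0.0, 1.0, 2.0],
})

mask = halo_res["N_subs"] == 1

# Compute the radius only for the selected rows and write it back through .loc.
selected = halo_res.loc[mask, ["H_x", "H_y", "H_z"]]
halo_res.loc[mask, "rh_field"] = np.sqrt((selected ** 2).sum(axis=1))
```

Rows where the mask is False end up with NaN in the new column, because nothing was assigned to them.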
Don't use apply
As a side note, in either scenario you can avoid apply for your function. For example:
halo_field['rh_field'] = (halo_field[['H_x', 'H_y', 'H_z']]**2).sum(1)**0.5
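A small check, on made-up coordinates, that the column-wise calculation matches the row-by-row `apply` from the question. Because `radvec` only uses NumPy arithmetic, it can also be called on whole columns unchanged:

```python
import numpy as np
import pandas as pd

def radvec(x, y, z):
    return np.sqrt(x**2 + y**2 + z**2)

# Hypothetical coordinates.
halo_field = pd.DataFrame({"H_x": [3.0, 1.0], "H_y": [4.0, 2.0], "H_z": [0.0, 2.0]})

# Row-by-row version from the question.
via_apply = halo_field.apply(
    lambda row: radvec(row["H_x"], row["H_y"], row["H_z"]), axis=1
)

# Vectorised version: pass the three columns straight to radvec.
via_columns = radvec(halo_field["H_x"], halo_field["H_y"], halo_field["H_z"])

halo_field["rh_field"] = via_columns
```

Both give the same values; the vectorised form avoids a Python-level function call per row, which matters on large frames.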
I am trying to lowercase and strip a column in Python 3 using pandas, but I get a warning. What is the right way to do this so the warning does not come up?
df["col1"] = df[["col1"]].apply(lambda x: x.str.strip())
df["col1"] = df[["col1"]].apply(lambda x: x.str.lower())
The warning
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self[k1] = value[k2]
How do I remove the warning?
To get rid of this warning, apply the string methods to a Series instead of a DataFrame. df[["col1"]] creates a new one-column DataFrame, which you are then assigning back to the column. If you modify the column itself, df["col1"], it will be fine. Additionally, I chained the two calls together.
df["col1"] = df["col1"].str.strip().str.lower()
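For illustration, the chained version on a tiny made-up column (the sample strings are assumptions):

```python
import pandas as pd

df = pd.DataFrame({"col1": ["  Foo ", "BAR", " baz"]})

# df["col1"] is a Series, so the .str accessor applies directly and the calls chain.
df["col1"] = df["col1"].str.strip().str.lower()
```

If `df` is itself a filtered subset of another frame, the warning can still appear; in that case take a `.copy()` when creating the subset.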
Trying to assign a date to a column in a DataFrame.
Assigning in the following way gives an error
for date in sorted(list(set(dates))):
    df.loc[:, 'DATE'] = date
Error: Cannot set a frame with no defined index and a scalar
Okay, fine:
for date in sorted(list(set(dates))):
    df['DATE'] = date
Warning: A value is trying to be set on a copy of a slice from a DataFrame, try using .loc ...
What exactly does Python want me to do here, so that I am not just trading an error for a warning?
Many thanks!
If you are sure that len(sorted(list(set(dates)))) == len(df), then you can simply do:
df['DATE'] = sorted(list(set(dates)))
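A sketch with made-up inputs (the dates, the `value` column and the name `looped` are assumptions) showing the list assignment next to the loop from the question. A list needs exactly one value per row, whereas a scalar is broadcast to every row, so the loop just overwrites the column on each pass:

```python
import pandas as pd

# Hypothetical inputs: three distinct dates (one duplicated) and a three-row frame.
dates = ["2021-03-02", "2021-03-01", "2021-03-02", "2021-03-03"]
df = pd.DataFrame({"value": [10, 20, 30]})

unique_dates = sorted(set(dates))

# Assigning a list requires exactly one value per row.
if len(unique_dates) == len(df):
    df["DATE"] = unique_dates

# By contrast, assigning a scalar broadcasts it to every row,
# so a loop of scalar assignments leaves only the last date.
looped = df[["value"]].copy()
for date in unique_dates:
    looped["DATE"] = date
```

`sorted(set(dates))` already returns a list, so the extra `list(...)` call in the original is not needed.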