I have a column in my data frame that contains both email addresses and values that are not emails.
With this slice I can select only the entries that are not emails:
df[~df['email'].str.contains('#', case=False)]['email']
But when I try to replace it with a value of my preference:
df[~df['email'].str.contains('#', case=False)]['email'] = 'No'
The column does not receive the change.
I don't get any error, just the following warning:
/home/rockstar/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
"""Entry point for launching an IPython kernel.
An image of my complete dataframe follows:
Also, df[~df['email'].str.contains('#',case=False)] = 'No' does make the change, but it overwrites every column of the matching rows, so I lose the data in the rest of the row.
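Refer to the following example code. It passes the row mask and the column label to a single .loc call, so the assignment is made on df itself rather than on an intermediate object.
import pandas as pd
df = pd.DataFrame({"E-mail": ["abc#de", "abcde"]})
# Rows without '#' in the 'E-mail' column get the value 'No'
df.loc[~df['E-mail'].str.contains('#', case=False), 'E-mail'] = 'No'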
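To show that only the targeted column changes, here is a minimal sketch with a second column. The sample data and the 'name' column are made up for illustration; only the '#' test and the 'No' replacement come from the question.

```python
import pandas as pd

# Hypothetical sample data; 'name' stands in for "the rest of the row".
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cy"],
    "email": ["ann#example.com", "not given", "cy#example.com"],
})

# Boolean mask: True for rows whose 'email' value does not contain '#'.
mask = ~df["email"].str.contains("#", case=False)

# A single .loc call with a row mask and a column label writes to df itself,
# and touches only the 'email' column of the matching rows.
df.loc[mask, "email"] = "No"

print(df)
```

The other columns of the matching rows are left as they were, which is the difference from assigning to df[mask] as a whole.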
I am trying to apply multiple user-defined functions to a column of a dataframe one by one.
This is the warning I get when I run it in a Django environment.
(.py3env) [root#---------]# python report_feedback_attribute.py runserver
feedback_attribute.py:89: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
data1['feedback_text'] = data1['feedback_text'].apply(clean_text) # the feedback_text column has been cleaned
feedback_attribute.py:93: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
data1['feedback_text'] = data1['feedback_text'].astype(str)
feedback_attribute.py:103: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
data1['feedback_text'] = data1['feedback_text'].apply(lambda x: tokenizer.tokenize(x.lower()))
It is returning an empty dictionary.
{'count': []}
I have already tried creating a copy of my dataframe and then working on it. Still no result.
I have also tried applying .loc in this way:
data1.loc[:, 'feedback_text'] = data1['feedback_text'].apply(clean_text)
(I made this change for every function that I defined and applied to the dataframe.)
I also tried creating a new column every time a function was applied to the previous column of a dataframe.
Still no results. Please help.
Edit:
This is data1 (just a sample):
I am using pandas 1.0.1. I am creating a new column that converts the date column to datetime, and I get the warning below. I also tried data.loc[:, "Datetime"] and still got the same warning. How can this be avoided?
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
data["Datetime"] = pd.to_datetime(data["Date"], infer_datetime_format=True)
Most likely you created your source DataFrame as a view of another
DataFrame (only some columns and / or only some rows).
Find in your code the place where your DataFrame is created and append .copy() there.
Then your DataFrame will be created as a fully independent DataFrame (with its
own data buffer) and this warning should not appear any more.
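As a rough sketch of that advice: the frame name raw, the "Value" column, and the filter condition are assumptions made up for this example; only the "Date" and "Datetime" column names come from the question.

```python
import pandas as pd

# Hypothetical source frame; the real one in the question is not shown.
raw = pd.DataFrame({
    "Date": ["2020-01-01", "2020-01-02", "2020-01-03"],
    "Value": [1, 2, 3],
})

# Filtering produces a subset of raw; .copy() makes it an independent DataFrame.
data = raw[raw["Value"] > 1].copy()

# The new column is added to the independent copy only.
data["Datetime"] = pd.to_datetime(data["Date"])

print(data.dtypes)
```

The place to add .copy() is where the subset is created, not where the new column is assigned.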
I have code as below.
import pandas as pd
import numpy as np
data = [['Alex',10,5,0],['Bob',12,4,1],['Clarke',13,6,0],['brke',15,1,0]]
df = pd.DataFrame(data,columns=['Name','Age','weight','class'],dtype=float)
df_numeric=df.select_dtypes(include='number')
df_non_numeric=df.select_dtypes(exclude='number')
df_non_numeric['class']=df_numeric['class'].copy()
It gives me the message below:
__main__:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
I want df_non_numeric to be independent from df_numeric.
I used df_numeric['class'].copy() based on suggestions given in other posts.
How can I avoid the message?
I think you need copy, because DataFrame.select_dtypes is a slicing operation that filters by column dtype (check Question 3):
df_numeric=df.select_dtypes(include='number').copy()
df_non_numeric=df.select_dtypes(exclude='number').copy()
Without the copy, if you modify values in df_non_numeric later, you will find that the modifications do not propagate back to the original data (df), and pandas issues the warning.
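A small sketch of the independence this gives. The data is a shortened, hypothetical version of the question's frame, built without dtype=float.

```python
import pandas as pd

# Hypothetical data modeled on the question.
df = pd.DataFrame({
    "Name": ["Alex", "Bob"],
    "Age": [10.0, 12.0],
    "class": [0.0, 1.0],
})

# Each subset is copied, so it no longer refers back to df.
df_numeric = df.select_dtypes(include="number").copy()
df_non_numeric = df.select_dtypes(exclude="number").copy()

# Adding a column to the independent copy.
df_non_numeric["class"] = df_numeric["class"]

# A later modification of the copy stays in the copy.
df_non_numeric.loc[0, "Name"] = "changed"

print(df)
print(df_non_numeric)
```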
Using this simple line of code, I keep getting a SettingWithCopyWarning that then carries through my whole code.
#make email a string
df['Email Address'] = df['Email Address'].astype(str)
C:\Users\xxx\AppData\Local\Continuum\Anaconda2\lib\site-packages\ipykernel\__main__.py:2: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
from ipykernel import kernelapp as app
I went through the documentation, but can't make it work with loc. The code below is wrong.
df.loc['Email Address'] = df.loc['Email Address'].astype(str)
Please excuse if this is a duplicate question - I searched it on stackoverflow, but couldn't find one that addresses loc and astype.
Your issue isn't with how you are making the assignment. It is with the dataframe prior to assignment. At some point before the assignment, you created df in such a way that it became a view into another dataframe. In older pandas versions you can verify this with bool(df.is_copy); that attribute was later deprecated.
If you are ok with df being a separate thing with no linkages to data in other dataframes...
df = df.copy()
Then proceed to make your assignment.
Update 03/21
I believe this is the correct way to write the assignment with loc:
df.loc[:, 'Email Address'] = df.loc[:, 'Email Address'].astype(str)
I have the following code:
block_table[[compared_attribute]] = block_table[[compared_attribute]].astype(int)
I want to change the datatype of a column. The code is working, but I get a warning from Python: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self[k1] = value[k2]
I looked into this warning and read that the code may be operating on a copy of the dataframe instead of overwriting the original, so I tried the following solutions with no luck...
block_table.loc[[compared_attribute]] = block_table[[compared_attribute]].astype(int)
block_table.loc[:,compared_attribute] = block_table[[compared_attribute]].astype(int)
It should be as simple as:
block_table.loc[:,compared_attribute] = block_table[compared_attribute].astype(int)
This assumes compared_attribute is a column label. If it is a row label instead, swap the colon and compared_attribute in the loc call.
It is also quite hard to answer without an example of what the data and compared_attribute look like.
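Since the real data is not shown, here is a minimal sketch of the two orientations of loc on made-up data. The frame contents, the index labels "x" and "y", and the value of compared_attribute are all assumptions for illustration.

```python
import pandas as pd

# Hypothetical stand-in for block_table.
block_table = pd.DataFrame(
    {"a": [1.0, 2.0], "b": [3.0, 4.0]},
    index=["x", "y"],
)
compared_attribute = "a"  # assumed to be a column label

# Colon first: all rows, one column.
column = block_table.loc[:, compared_attribute]

# Label first: one row, all columns.
row = block_table.loc["x", :]

# On an independent frame, plain column assignment replaces the column,
# including its dtype.
block_table[compared_attribute] = block_table[compared_attribute].astype(int)

print(block_table.dtypes)
```

Note that the selection uses a single pair of brackets around compared_attribute, which yields a Series rather than a one-column DataFrame.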