How to assign NumPy array values to another variable - python

This is my code:
y_predForThisMatchType = model.predict(X_test, num_iteration=model.best_iteration)
print(type(y_predForThisMatchType))
y_predForThisMatchType = y_predForThisMatchType.reshape(-1)
print(type(y_predForThisMatchType))
count = 0
for i in range(len(y_pred)):
    if y_pred.loc[i] == abType:
        y_pred.loc[i] = y_predForThisMatchType[count]
        count = count + 1
Output:
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
/opt/conda/lib/python3.6/site-packages/pandas/core/indexing.py:189: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self._setitem_with_indexer(indexer, value)
Python just prints the above output, and that's all. The program is technically running, but the code below this point does not get executed, and no real error is shown.
Error Line:
y_pred.loc[i] = y_predForThisMatchType[count]
The y_pred variable is a pandas DataFrame.

Have you checked your output completely?
In my experience, your code is working.
The message displayed is only a warning, not an error, and it can be disabled with:
pandas.options.mode.chained_assignment = None # default='warn'
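As a side note, the row-by-row loop from the question can probably be replaced by a single masked assignment. The following is only a minimal sketch under assumptions: y_pred is treated as a single column (a Series), ab_type is a hypothetical placeholder value, and the sample numbers are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the variables in the question.
ab_type = -1.0  # placeholder marking rows that still need a prediction
y_pred = pd.Series([0.2, ab_type, 0.5, ab_type, ab_type])
y_pred_for_this_match_type = np.array([0.7, 0.8, 0.9])

# Work on an independent copy so pandas does not treat y_pred as a slice.
y_pred = y_pred.copy()

# Boolean mask of the rows to fill; the predictions are assigned in order.
mask = y_pred == ab_type
assert mask.sum() == len(y_pred_for_this_match_type)
y_pred.loc[mask] = y_pred_for_this_match_type

print(y_pred.tolist())
```

The length check guards against a mismatch between the number of placeholder rows and the number of predictions, which the original counter-based loop would only reveal as an IndexError.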

Related

df.at warning message would like to optimize the code

I am using the following code to add NUTS values into new columns of the dfEU dataframe. I came up with this code from the following post:
pandas .at versus .loc
Is there a way to solve this warning?
dfEU was created using the following query:
dfEU = df.query('continent == "EU" & country_id == "Belgium"')
for row in dfEU.itertuples():
    lati=float(getattr(row, 'locationlatitude'))
    longi=float(getattr(row, 'locationlongitude'))
    nuts = nf.find(lat=lati, lon=longi)
    if nuts:
        dfEU.at[row.Index, 'nuts1'] = nuts[0].get('NUTS_NAME')
        dfEU.at[row.Index, 'nuts1id'] = nuts[0].get('FID')
        dfEU.at[row.Index, 'nuts2'] = nuts[1].get('NUTS_NAME')
        dfEU.at[row.Index, 'nuts2id'] = nuts[1].get('FID')
        dfEU.at[row.Index, 'nuts3'] = nuts[2].get('NUTS_NAME')
        dfEU.at[row.Index, 'nuts3id'] = nuts[2].get('FID')
    else:
        dfEU.at[row.Index, 'nuts1'] = 'Nan'
When launching the code I receive the following warning:
C:\Users\win\anaconda3\lib\site-packages\pandas\core\indexing.py:1596: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self.obj[key] = _infer_fill_value(value)
C:\Users\win\anaconda3\lib\site-packages\pandas\core\indexing.py:1765: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
isetter(loc, value)
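One likely cause is that dfEU is the result of df.query(...), so pandas treats it as a slice of df. A minimal sketch of the usual remedy, taking an explicit copy right after the query, could look like this. The find_nuts function is a hypothetical stand-in for nf.find, and the sample rows are made up for illustration.

```python
import pandas as pd

# Made-up sample data using the column names from the question.
df = pd.DataFrame({
    "continent": ["EU", "EU", "NA"],
    "country_id": ["Belgium", "France", "Canada"],
    "locationlatitude": ["50.85", "48.85", "45.42"],
    "locationlongitude": ["4.35", "2.35", "-75.69"],
})


def find_nuts(lat, lon):
    """Hypothetical stand-in for nf.find: returns three fake NUTS levels."""
    return [{"NUTS_NAME": f"level{k}", "FID": f"ID{k}"} for k in (1, 2, 3)]


# .copy() makes dfEU an independent DataFrame instead of a slice of df.
dfEU = df.query('continent == "EU" & country_id == "Belgium"').copy()

for row in dfEU.itertuples():
    nuts = find_nuts(lat=float(row.locationlatitude),
                     lon=float(row.locationlongitude))
    if nuts:
        # Fill nuts1/nuts1id ... nuts3/nuts3id from the three levels.
        for level, entry in enumerate(nuts[:3], start=1):
            dfEU.at[row.Index, f"nuts{level}"] = entry.get("NUTS_NAME")
            dfEU.at[row.Index, f"nuts{level}id"] = entry.get("FID")
    else:
        dfEU.at[row.Index, "nuts1"] = "Nan"

print(dfEU.filter(like="nuts"))
```

Because dfEU is a copy, the new columns exist only on dfEU and df itself stays unchanged.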

Using loc still raises SettingWithCopyWarning while changing column

I want to remove URLs from the text column of my df by filtering out everything that starts with http or https, like below:
data.loc[:,'text_'] = data['text_'].str.replace(r'\s*https?://\S+(\s+|$)', ' ').str.strip()
I used the loc as advised in other answers but I still keep getting the warning.
/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:1: FutureWarning: The default value of regex will change from True to False in a future version.
"""Entry point for launching an IPython kernel.
time: 9.81 s (started: 2022-03-19 06:35:42 +00:00)
/usr/local/lib/python3.7/dist-packages/pandas/core/indexing.py:1773: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self._setitem_single_column(ilocs[0], value, pi)
How do I do this operation correctly, i.e. without the warning?
UPDATE:
I've generated data from kaggle dataset:
kaggle datasets download clmentbisaillon/fake-and-real-news-dataset
and then:
true_df.drop_duplicates(keep='first')
fake_df.drop_duplicates(keep='first')
true_df['is_fake'] = 0
fake_df['is_fake'] = 1
news_df = pd.concat([true_df, fake_df])
news_df = news_df.sample(frac=1).reset_index(drop=True)
drop_list = ['subject', 'date']
column_filter = news_df.filter(drop_list)
news_df.drop(column_filter, axis=1)
news_df['text_'] = news_df['title'] + news_df['text']
data = news_df[['text_', 'is_fake']]
Next for the following line:
data.loc[:,'text_'] = data['text_'].str.replace(r'\s*https?://\S+(\s+|$)', ' ').str.strip()
I get the warning shown at the start of the post.
UPDATE 2:
As mentioned by @Riley, adding
data = data.copy()
fixes the SettingWithCopyWarning. However, the
/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:1: FutureWarning: The default value of regex will change from True to False in a future version.
"""Entry point for launching an IPython kernel.
still remains. To fix it, pass regex=True to replace:
data.loc[:,'text_'] = data['text_'].str.replace(r'\s*https?://\S+(\s+|$)', ' ', regex=True).str.strip()
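Putting both fixes together, a minimal self-contained sketch might look like the following. The three sample rows are made up and stand in for the Kaggle data, which is not loaded here.

```python
import pandas as pd

# Made-up rows standing in for the combined news data.
news_df = pd.DataFrame({
    "text_": [
        "Breaking news https://example.com/a read more",
        "No link here",
        "See http://example.org",
    ],
    "is_fake": [0, 1, 0],
})

# Take an explicit copy so that data is independent of news_df.
data = news_df[["text_", "is_fake"]].copy()

# regex=True states explicitly that the pattern is a regular expression.
data["text_"] = (
    data["text_"]
    .str.replace(r"\s*https?://\S+(\s+|$)", " ", regex=True)
    .str.strip()
)

print(data["text_"].tolist())
```

The original news_df keeps its URLs, since only the copy is modified.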

Python DataFrame Issue with Warning

I am having trouble finding a solution for SettingWithCopyWarning in Jupyter Notebook. I would appreciate any insight and/or solutions. Thank you in advance.
Code:
matches2['players'] = list(zip(matches2['player_1_name'], matches2['player_2_name']))
g = matches2.groupby('players')
df_list = []
for group, df in g:
    df = df[['winner']]
    n = df.shape[0]
    player_1_h2h = np.zeros(n)
    player_2_h2h = np.zeros(n)
    p1 = group[0]
    p2 = group[1]
    for i in range(1,n):
        if df.iloc[i-1,0] == p1:
            player_1_h2h[i] = player_1_h2h[i-1] + 1
            player_2_h2h[i] = player_2_h2h[i-1]
        else:
            player_1_h2h[i] = player_1_h2h[i-1]
            player_2_h2h[i] = player_2_h2h[i-1] + 1
    df['player_1_h2h'] = player_1_h2h
    df['player_2_h2h'] = player_2_h2h
    df_list.append(df)
Error:
<ipython-input-214-d8e04df2295c>:32: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df['player_1_h2h'] = player_1_h2h
<ipython-input-214-d8e04df2295c>:33: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df['player_2_h2h'] = player_2_h2h
I would recommend disabling the warning.
import pandas as pd
pd.options.mode.chained_assignment = None
For more information on this behavior, see this question and look for Garrett's answer.
You can ignore this warning, as it's a false positive in this case, but if you want to avoid it entirely, you can change
df['player_1_h2h'] = player_1_h2h
df['player_2_h2h'] = player_2_h2h
to
df = df.assign(
    player_1_h2h=player_1_h2h,
    player_2_h2h=player_2_h2h
)
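For completeness, here is a small runnable sketch of the whole loop with the assign change applied. The sample matches are made up, and as a simplification it groups by the two name columns directly instead of building the 'players' tuple column, which is an assumption on my part rather than the asker's exact setup.

```python
import numpy as np
import pandas as pd

# Made-up match history: A and B meet three times, C and D meet once.
matches2 = pd.DataFrame({
    "player_1_name": ["A", "A", "A", "C"],
    "player_2_name": ["B", "B", "B", "D"],
    "winner": ["A", "B", "A", "C"],
})

df_list = []
for (p1, p2), df in matches2.groupby(["player_1_name", "player_2_name"]):
    n = df.shape[0]
    player_1_h2h = np.zeros(n)
    player_2_h2h = np.zeros(n)
    # Head-to-head record before each match, based on earlier winners.
    for i in range(1, n):
        if df["winner"].iloc[i - 1] == p1:
            player_1_h2h[i] = player_1_h2h[i - 1] + 1
            player_2_h2h[i] = player_2_h2h[i - 1]
        else:
            player_1_h2h[i] = player_1_h2h[i - 1]
            player_2_h2h[i] = player_2_h2h[i - 1] + 1
    # assign returns a new DataFrame, so nothing is set on a slice.
    df_list.append(
        df[["winner"]].assign(
            player_1_h2h=player_1_h2h,
            player_2_h2h=player_2_h2h,
        )
    )

result = pd.concat(df_list)
print(result)
```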

Pandas SettingWithCopyWarning over re-ordering column's categorical values [duplicate]

Jupyter Notebook is returning this warning:
C:\anaconda\lib\site-packages\pandas\core\indexing.py:337: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self.obj[key] = _infer_fill_value(value)
C:\anaconda\lib\site-packages\pandas\core\indexing.py:517: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self.obj[item] = s
After running the following code:
def group_df(df,num):
    ln = len(df)
    rang = np.arange(ln)
    splt = np.array_split(rang,num)
    lst = []
    finel_lst = []
    for i,x in enumerate(splt):
        lst.append([i for x in range(len(x))])
    for k in lst:
        for j in k:
            finel_lst.append(j)
    df['group'] = finel_lst
    return df
def KNN(dafra,folds,K,fi,target):
    df = group_df(dafra,folds)
    avarge_e = []
    for i in range(folds):
        train = df.loc[df['group'] != i]
        test = df.loc[df['group'] == i]
        test.loc[:,'pred_price'] = np.nan
        test.loc[:,'rmse'] = np.nan
        print(test.columns)
KNN(data,5,5,'GrLivArea','SalePrice')
The warning message recommends using .loc indexing, which I did, but it did not help. What is the problem? I have gone through the related questions and read the documentation, but I still don't get it.
I think you need copy:
train = df.loc[df['group'] != i].copy()
test = df.loc[df['group'] == i].copy()
If you modify values in test later, you will find that the modifications do not propagate back to the original data (df), and that is what pandas is warning about.
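A minimal sketch of the fold split with the copies in place is shown below. The six sample rows are made up, and group_df is rewritten here with np.repeat so that it returns a copy instead of modifying its argument, which is my own simplification and not the asker's code.

```python
import numpy as np
import pandas as pd

# Made-up data with the two column names used in the question.
data = pd.DataFrame({
    "GrLivArea": [850, 1200, 1500, 1710, 2000, 2400],
    "SalePrice": [100, 150, 180, 208, 250, 300],
})


def group_df(df, num):
    """Return a copy of df with a 'group' column assigning rows to num folds."""
    sizes = [len(part) for part in np.array_split(np.arange(len(df)), num)]
    out = df.copy()
    out["group"] = np.repeat(np.arange(num), sizes)
    return out


folds = 3
df = group_df(data, folds)

for i in range(folds):
    # .copy() makes train and test independent of df.
    train = df.loc[df["group"] != i].copy()
    test = df.loc[df["group"] == i].copy()
    test.loc[:, "pred_price"] = np.nan
    test.loc[:, "rmse"] = np.nan

print(test.columns.tolist())
```

The new columns appear only on test; df and data are left as they were.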
