multiplication of two columns in dataframe SettingWithCopyWarning - python

I have a large dataframe. I'm trying to do a simple multiplication between two columns and put the result in a new column. When I do that, I get this warning:
SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
My code looks like this:
DF['mult'] = DF['price'] * DF['rate']
I tried loc, but it didn't work. Does anyone have a solution?

You should use df.assign() in this case:
df2 = DF.assign(mult=DF['price'] * DF['rate'])
You get back a new dataframe with a 'mult' column added.
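A minimal sketch of two ways to avoid the warning, on made-up data. It assumes the warning appears because DF was itself created by filtering another dataframe; the question does not show how DF was built, so that is a guess. The qty column and the subset name are hypothetical.

```python
import pandas as pd

# Hypothetical data; only 'price' and 'rate' come from the question.
DF = pd.DataFrame(
    {"price": [10.0, 20.0, 30.0], "rate": [0.5, 2.0, 1.5], "qty": [1, 0, 3]}
)

# Taking an explicit copy of a filtered subset makes it an independent frame,
# so adding a column to it does not trigger SettingWithCopyWarning.
subset = DF[DF["qty"] > 0].copy()
subset["mult"] = subset["price"] * subset["rate"]

# assign() leaves DF untouched and returns a new frame with the extra column.
df2 = DF.assign(mult=DF["price"] * DF["rate"])
```

If the original dataframe has to keep the new column, reassigning the result (DF = DF.assign(...)) is one option.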

Related

SnowparkFetchDataException: (1406): Failed to fetch a Pandas Dataframe. The error is: Found non-unique column index

While running some code like this:
session = ...
return session.table([DB,SCHEMA, MANUAL_METRICS_BY_SIZE]).select("TECHNOLOGY","OBJECTTYPE","OBJECTTYPE","SIZE","EFFORT").to_pandas()
I got this error.
Any idea of what might be causing this?
Well, it was easier than I thought.
I had a duplicated column name (here, "OBJECTTYPE" was selected twice) and pandas doesn't like that.
Just check your columns, for example with df.columns, and remove the duplicated column.
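A small sketch of how the duplicate could be spotted before calling to_pandas(). It only uses pandas on a plain list of names and does not touch Snowpark; the variable names are hypothetical.

```python
import pandas as pd

# Column list mirroring the select() call in the question.
selected = ["TECHNOLOGY", "OBJECTTYPE", "OBJECTTYPE", "SIZE", "EFFORT"]

# Index.duplicated() marks every repeat of a name after its first occurrence.
names = pd.Index(selected)
duplicates = names[names.duplicated()].tolist()

# dict.fromkeys() drops repeats while keeping the original order.
unique_columns = list(dict.fromkeys(selected))
```

Passing unique_columns to select() instead of the original list would then avoid the non-unique column index.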

Unable to update new column values in rows derived from an existing column with multiple values separated by ','?

Original dataframe
Converted Dataframe using stack and split:
Adding new column to a converted dataframe:
What I am trying to do is add a new column using np.select(condition, values), but it is not updating the two additional rows derived from H1; it returns 0 or NaN for them. Can someone please help me here?
Please note that I have already reset the index, but it still doesn't help.
I think using numpy in this situation is unnecessary.
You can use something like the following code, with a single .loc call so the assignment is written to df itself rather than to a temporary copy:
df.loc[df.State == 'CT', 'H3'] = 4400000
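A minimal sketch of a mask-based assignment with .loc, on made-up data (only the State and H3 names come from the post). A single .loc call with the row mask and the column label writes to df itself, whereas chained indexing of the form df[mask]['H3'] = ... may write to a temporary copy and leave df unchanged.

```python
import pandas as pd

# Hypothetical data; the real frame in the question is not shown.
df = pd.DataFrame({"State": ["CT", "NY", "CT"], "H3": [0, 0, 0]})

# Row mask and column label in one .loc call: the update lands in df.
df.loc[df["State"] == "CT", "H3"] = 4400000
```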

Changing Values in multiindex pandas dataframe

I have loaded a multiindex matrix from Excel into a pandas dataframe.
df = pd.read_excel("filename.xlsx", sheet_name=sheet, header=1, index_col=[0, 1, 2], usecols="B:AZ")
The dataframe has three index columns and one header, so it has 4 indices. Part of the dataframe looks like this:
When I want to show a particular value, it works like this:
df[index1][index2][index3][index4]
I now want to change certain values in the dataframe. After searching through different forums for a while, it seemed like df.at would be the right method to do this. So I tried this:
df.at[index1,index2,index3,index4]=10 (or any random value)
but I got "ValueError: Not enough indexers for scalar access (setting)!". I also tried different orders for the indices inside the df.at brackets.
Any help on this would be much appreciated!
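It seems you need something like this, with the three row index levels passed as one tuple and the column label as the second argument:
df.loc[('Kosten', 'LKW', 'Sofia'), 'Ruse'] = 10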
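A small sketch on made-up data showing why the tuple matters: the dataframe has two axes, so both .loc and .at expect exactly two indexers, a row key and a column key. With a three-level row index the row key is a 3-tuple. The 'Varna' and 'Burgas' labels and all values are hypothetical.

```python
import pandas as pd

# Hypothetical three-level row index with two ordinary columns.
index = pd.MultiIndex.from_tuples(
    [("Kosten", "LKW", "Sofia"), ("Kosten", "LKW", "Varna")]
)
df = pd.DataFrame({"Ruse": [1, 2], "Burgas": [3, 4]}, index=index)

# Row key is one tuple covering all three levels; column key is separate.
df.loc[("Kosten", "LKW", "Sofia"), "Ruse"] = 10

# .at accepts the same (row tuple, column) pair for scalar access.
df.at[("Kosten", "LKW", "Varna"), "Burgas"] = 20
```

Passing four separate keys, as in df.at[index1, index2, index3, index4], does not match the two axes, which would explain the "Not enough indexers" error.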

Question on how to create a new column based on current df columns

I am trying to figure out how to create a new column for my df and can't seem to get it to work with what I have tried.
I have tried using
loans_df.insert("Debt_Ratio",["MonthlyDebt"*12/"Income"])
but I keep getting an error stating unsupported operand type.
The new column is calculated from columns that already exist in my df.
My expected result is that the new column is inserted into the df, with its values defined by that calculation.
Hope this all makes sense!
Assuming the dataframe is loans_df and the column you want to create is named Debt_Ratio, the following will do the job:
loans_df['Debt_Ratio'] = loans_df['MonthlyDebt'] * 12 / loans_df['Income']
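If the column has to go at a specific position, DataFrame.insert can still be used, but it takes the position first, then the column name, then the values. The error in the question comes from multiplying and dividing the strings "MonthlyDebt" and "Income" themselves instead of the columns they name. A sketch on made-up numbers:

```python
import pandas as pd

# Hypothetical values; only the column names come from the question.
loans_df = pd.DataFrame(
    {"MonthlyDebt": [500.0, 1000.0], "Income": [60000.0, 48000.0]}
)

# Compute the ratio from the actual columns, not from their names as strings.
debt_ratio = loans_df["MonthlyDebt"] * 12 / loans_df["Income"]

# insert(loc, column, value): place the new column at position 1, in place.
loans_df.insert(1, "Debt_Ratio", debt_ratio)
```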

Accessing data in a multiindexed pandas dataframe

Grouping by two columns and creating a DataFrame from the result gives me this multiindex table. I didn't manage to access an element from it as described in the documentation. The access fails with KeyError: ('110166987', 'Direct Mail'). What am I doing wrong here?
As a second question, can I somehow pivot this DataFrame so that the second index variable "Channel" becomes the columns?
Let's try:
df.loc[('110166987','Direct Mail')]
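A sketch on made-up data covering both parts of the question. One possible cause of the KeyError, which is an assumption since the original table is not shown, is that the ID level holds integers while the lookup uses a string; the key has to match the dtype of the index level. For the second part, unstack() moves an index level into the columns. The CustomerID and Amount names are hypothetical.

```python
import pandas as pd

# Hypothetical raw data; the ID is stored as an integer here.
raw = pd.DataFrame(
    {
        "CustomerID": [110166987, 110166987, 220000001],
        "Channel": ["Direct Mail", "Email", "Email"],
        "Amount": [10, 5, 7],
    }
)
grouped = raw.groupby(["CustomerID", "Channel"])[["Amount"]].sum()

# The lookup key must use the same types as the index levels (int, str).
row = grouped.loc[(110166987, "Direct Mail")]

# unstack() turns the "Channel" level into columns; missing pairs become NaN.
wide = grouped["Amount"].unstack("Channel")
```

Checking grouped.index.dtypes (or the dtype of each level) shows which type the lookup key needs.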
