Replace column Values based on Index of other Dataframe - python

I am trying to replace the values in the "All Assortment" column of the "buyer" data frame.
I need to replace them with the data from the "All Stores" column of the "asl" data frame. The twist is that the values in "All Assortment" have to be matched against the index of the asl data frame for the replacement to work.

Hard to say without a minimal reproducible example, but try mapping the values of buyer['All Assortment'] to corresponding values from the asl['All Stores'] column based on the asl index:
buyer['All Assortment'] = buyer['All Assortment'].map(asl['All Stores'])
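To make the mapping concrete, here is a minimal sketch with invented data. The store codes and numbers are hypothetical; only the column names come from the question. When Series.map is given a Series, it looks each value up in that Series' index, so values without a match become NaN.

```python
import pandas as pd

# Hypothetical data: asl is indexed by the codes that appear in buyer['All Assortment'].
asl = pd.DataFrame({"All Stores": [10, 20, 30]}, index=["A", "B", "C"])
buyer = pd.DataFrame({"All Assortment": ["B", "A", "Z"]})

# Look up each 'All Assortment' value in the index of asl['All Stores'].
buyer["All Assortment"] = buyer["All Assortment"].map(asl["All Stores"])
print(buyer)  # "Z" has no match in asl.index, so that row becomes NaN
```

If unmatched rows should keep their original value instead of turning into NaN, one option is to fill the gaps from a copy of the original column with fillna.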

Related

Fill in Below a group df pandas

I have a data frame that I transposed, and it looks like this
I would like to know how I can fill in the empty rows of each group, following the example below
Where the first column is filled with the first value down to the last empty row.
How can I do this if the column is grouped?
In your case, repeat the index values of your data frame five times, save them in a new column, and then make that column the index.
ibov_transpose['index'] = ibov_transpose.index.repeat(5)
ibov_transpose = ibov_transpose.set_index('index')  # set_index returns a new frame and drops the 'index' column
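A different way to fill a label downwards is a forward fill. This is only a sketch, not the approach from the answer above: it assumes the group label appears on the first row of each group and that the rows below it are empty strings. The frame and its column names are hypothetical.

```python
import pandas as pd

# Hypothetical frame: the group label is present only on the first row of each group.
df = pd.DataFrame({
    "group": ["A", "", "", "B", ""],
    "value": [1, 2, 3, 4, 5],
})

# Turn empty strings into missing values, then carry the last seen label downwards.
df["group"] = df["group"].mask(df["group"] == "").ffill()
print(df)
```

If the empty cells are already NaN, the mask step can be dropped and ffill used on its own.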

Sorting multiple Pandas Dataframe Columns based on the sorting of one column

I have a dataframe with two columns in it, 'Item' and 'Calories'. I have sorted the 'Calories' column numerically using a selection sort algorithm, but I need the 'Item' column to change as well so that each calorie value still matches the correct item.
import pandas as pd
menu=pd.read_csv("menu.csv",encoding="utf-8") # Read in the csv file
menu_df=pd.DataFrame(menu,columns=['Item','Calories']) # Creating a dataframe with just the Item and Calories columns
print(menu_df) # Display un-sorted data
#print(menu_df.at[4,'Calories']) # Practise calling on individual elements within the dataframe.
# Start of selection sort
for outerloopindex in range (len(menu_df)):
    smallest_value_index=outerloopindex
    for innerloopindex in range(outerloopindex+1,len(menu_df)):
        if menu_df.at[smallest_value_index,'Calories']>menu_df.at[innerloopindex,'Calories']:
            smallest_value_index=innerloopindex
    # Changing the order of the Calorie column.
    menu_df.at[outerloopindex,'Calories'],menu_df.at[smallest_value_index,'Calories']=menu_df.at[smallest_value_index,'Calories'],menu_df.at[outerloopindex,'Calories']
# End of selection sort
print(menu_df)
Any help on how to get the 'Item' column to match the corresponding 'Calorie' values after the sort would be really really appreciated.
Thanks
Martin
You can replace df.at[...] with df.loc[...] and refer to multiple columns instead of a single one, so that each swap moves the 'Calories' and 'Item' values of a row together.
So replace this line:
menu_df.at[outerloopindex,'Calories'],menu_df.at[smallest_value_index,'Calories']=menu_df.at[smallest_value_index,'Calories'],menu_df.at[outerloopindex,'Calories']
with this line:
menu_df.loc[outerloopindex,['Calories','Item']],menu_df.loc[smallest_value_index,['Calories','Item']]=menu_df.loc[smallest_value_index,['Calories','Item']],menu_df.loc[outerloopindex,['Calories','Item']]
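If writing the selection sort by hand is not a requirement, pandas can sort whole rows directly. The sketch below uses a small invented menu in place of menu.csv; the items and calorie values are hypothetical.

```python
import pandas as pd

# Hypothetical menu data standing in for menu.csv.
menu_df = pd.DataFrame({
    "Item": ["Burger", "Salad", "Fries"],
    "Calories": [550, 150, 320],
})

# sort_values reorders entire rows, so each 'Item' stays attached to its 'Calories'.
sorted_df = menu_df.sort_values("Calories").reset_index(drop=True)
print(sorted_df)
```

reset_index(drop=True) is optional; it only renumbers the rows from 0 after sorting.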

Transfer string values containing a specific word to other columns in a data frame (pandas, Python)

I have the following problem with this example data, which is taken from a bigger data set with 420 rows.
The columns are "m2", "rooms", and "toilets". I'm using pandas and Python.
I would like to move the strings that contain the words 'room', 'rooms', 'toilet' and 'toilets' from the columns 'm2' and 'rooms' to their corresponding columns, in this case 'rooms' and 'toilets'. I am having trouble moving these values while keeping each one in its original row.
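Build the conditions using str.contains (na=False makes missing values count as non-matches instead of propagating NaN into the condition)
cond1=df.rooms.str.contains('toilet',na=False)
cond2=df.m2.str.contains('room',na=False)
cond3=df.m2.str.contains('toilet',na=False)
Then apply np.where(condition, value if true, value otherwise), with numpy imported as np
df['toilets']=np.where(cond1,df.rooms,df.toilets)
df['rooms']=np.where(cond2,df.m2,df.rooms)
df['toilets']=np.where(cond3,df.m2,df.toilets)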
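The steps above can be tried on a small invented frame. The rows are hypothetical, and the final step that blanks the moved cells in 'm2' is an addition that the answer does not include; drop it if the original text should stay in place.

```python
import numpy as np
import pandas as pd

# Hypothetical rows in which some values landed in the wrong column.
df = pd.DataFrame({
    "m2": ["80", "2 rooms", "1 toilet"],
    "rooms": ["3 rooms", "1 toilet", None],
    "toilets": ["2 toilets", None, None],
})

# na=False makes missing cells count as "no match".
cond1 = df["rooms"].str.contains("toilet", na=False)
cond2 = df["m2"].str.contains("room", na=False)
cond3 = df["m2"].str.contains("toilet", na=False)

# Move each misplaced value into the column it belongs to.
df["toilets"] = np.where(cond1, df["rooms"], df["toilets"])
df["rooms"] = np.where(cond2, df["m2"], df["rooms"])
df["toilets"] = np.where(cond3, df["m2"], df["toilets"])

# Assumption: the moved cells in 'm2' should be emptied afterwards.
df["m2"] = df["m2"].mask(cond2 | cond3)
print(df)
```

The order of the three np.where calls matters: 'rooms' is copied into 'toilets' before 'rooms' itself is overwritten from 'm2'.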

Pandas: Find string in a column and replace them with numbers with incrementing values

I am working on a dataframe with multiple columns, and one of the columns contains string values in more than 1000 rows. Kindly check the table below for more details:
In the above image I want to change the string values in the column Group_Number to numbers by taking the value from the first column (MasterGroup) and incrementing it by one (01), so that the values look like the ones below:
I also need to check for duplicated strings: if a string repeats, it should get the number that was already assigned to it rather than a new one. For example, in the above image ANAYSIM is duplicated, and instead of getting a new sequence number, the repeated string should reuse the number already given.
I have checked different links, but they focus on replacing with values supplied by the user:
Pandas DataFrame: replace all values in a column, based on condition
Change one value based on another value in pandas
Conditional Replace Pandas
Any help with achieving the desired outcome is highly appreciated.
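We could use cumcount with groupby to number the rows within each MasterGroup, and to_numeric to find the entries of Group_number that are not numbers
s=(df.groupby('MasterGroup').cumcount()+1).mul(10).astype(str)
t=pd.to_numeric(df.Group_number, errors='coerce')
Then we assign the new value wherever Group_number is not numeric
df.loc[t.isnull(), 'Group_number']=df.MasterGroup.astype(str)+s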
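The snippet above numbers every row, so a repeated string such as ANAYSIM would receive a different number each time. The sketch below is one possible way to give duplicates the same number. It is not the answer's method, and since the expected output was only shown in an image, the data and the number format (MasterGroup followed by a two-digit counter) are assumptions.

```python
import pandas as pd

# Hypothetical data; the real table was only available as an image.
df = pd.DataFrame({
    "MasterGroup": [1000, 1000, 1000, 2000, 2000],
    "Group_number": ["555", "ANAYSIM", "ANAYSIM", "OTHER", "ANAYSIM"],
})

# Rows whose Group_number is not a number are the ones to replace.
is_text = pd.to_numeric(df["Group_number"], errors="coerce").isna()
text_rows = df[is_text]

# Number each distinct (MasterGroup, string) pair once: 1, 2, ... per MasterGroup.
unique_pairs = text_rows.drop_duplicates(["MasterGroup", "Group_number"])
codes = unique_pairs.groupby("MasterGroup").cumcount() + 1
new_id = unique_pairs["MasterGroup"].astype(str) + codes.astype(str).str.zfill(2)
lookup = unique_pairs[["MasterGroup", "Group_number"]].assign(new_id=new_id)

# A left merge keeps the row order of text_rows, so the ids line up with the mask.
merged = text_rows.merge(lookup, on=["MasterGroup", "Group_number"], how="left")
df.loc[is_text, "Group_number"] = merged["new_id"].to_numpy()
print(df)
```

This sketch does not check whether a generated number collides with a number that already exists in the column; real data would need that check.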

Cleaning Data: Replacing Current Column Values with Values mapped in Dictionary

I have been trying to wrap my head around this for a while now and have yet to come up with a solution.
My question is: how do I change the current values in multiple columns based on the column name if a criterion is met?
I have survey data which has been read in as a pandas csv dataframe:
import pandas as pd
df = pd.read_csv("survey_data")
I have created a dictionary with column names and the values I want in each column if the current column value is equal to 1. Each column contains 1 or NaN. Basically any column within the data frame ending in '_SA' =5, '_A' =4, '_NO' =3, '_D' =2 and '_SD' stays as the current value 1. All of the 'NaN' values remain as is. This is the dictionary:
op_dict = {
    'op_dog_SA':5,
    'op_dog_A':4,
    'op_dog_NO':3,
    'op_dog_D':2,
    'op_dog_SD':1,
    'op_cat_SA':5,
    'op_cat_A':4,
    'op_cat_NO':3,
    'op_cat_D':2,
    'op_cat_SD':1,
    'op_fish_SA':5,
    'op_fish_A':4,
    'op_fish_NO':3,
    'op_fish_D':2,
    'op_fish_SD':1}
I have also created a list of the columns within the data frame I would like to be changed if the current column value = 1 called [op_cols]. Now I have been trying to use something like this that iterates through the values in those columns and replaces 1 with the mapped value in the dictionary:
for i in df[op_cols]:
    if i == 1:
        df[op_cols].apply(lambda x: op_dict.get(x,x))
df[op_cols]
It is not spitting out an error but it is not replacing the 1 values with the corresponding value from the dictionary. It remains as 1.
Any advice/suggestions on why this would not work or a more efficient way would be greatly appreciated
So if I understand your question, you want to replace all ones in a column with 1, 2, 3, 4 or 5 depending on the column name?
I think all you need to do is iterate through your list and multiply by the value your dict returns:
for col in op_cols:
    df[col] = df[col]*op_dict[col]
This does what you describe and is far faster than replacing every value individually. NaNs will still be NaNs; you could handle those in the loop with fillna if you like.
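A small worked example shows why the multiplication is enough. The frame below is invented and uses only two of the survey columns; because each cell is either 1 or NaN, multiplying by the mapped value turns 1 into that value and leaves NaN untouched.

```python
import numpy as np
import pandas as pd

# Hypothetical slice of the survey: every cell is 1 or NaN.
df = pd.DataFrame({
    "op_dog_SA": [1, np.nan, 1],
    "op_dog_D": [np.nan, 1, np.nan],
})
op_dict = {"op_dog_SA": 5, "op_dog_D": 2}
op_cols = list(op_dict)

# 1 * value gives the value; NaN * value stays NaN.
for col in op_cols:
    df[col] = df[col] * op_dict[col]
print(df)
```

The original loop had no effect for two reasons: iterating over a DataFrame yields column names rather than cell values, so i == 1 is never true, and the result of apply was never assigned back.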
