Replacing dataframe column list values with values from another dataframe - python

I am trying to replace data in one dataframe by comparing its columns with another dataframe, with values like below.
I need to map the 'members' column in df1 to the 'uid' column in df2 and get the corresponding ipv4-address for each member.
Dataframe 1:

    uid                                   members                                       type
42  afea136c-217f-4b1d-857c-dc4075bxxxxx  [08xx-b8xx- 4bcf-8axx-5f86xxxxxx, 64xx5c4..  group
Dataframe 2:

     uid                              name                            ipv4-address  type
506  08xx-b8xx- 4bcf-8axx-5f86xxxxxx  l_re-exx-xx-xx-19.172.211.0m23  19.172.211.0  network
589  64xx5c4..                        l_re-exx-xx-xx-19.172.211.0m23  19.152.210.0  network
Is it possible to replace the 'members' column values, or just create a new column in df1 with the ipv4-addresses from df2?
Expected outcome:

    uid                                   members                          type
42  afea136c-217f-4b1d-857c-dc4075bxxxxx  [19.172.211.0, 19.152.210.0,..]  group

If you have already filtered df1 and df2 down to just the rows you need, you can do
ips = df2['ipv4-address'].tolist()
and then set
df1['members'] = ips
Otherwise you'll need a little more logic to pick the right rows to update.

Let us try explode, then map (note the column is named 'members', and per the question the member values match df2's 'uid' column):
df1['new'] = df1['members'].explode().map(df2.set_index('uid')['ipv4-address']).groupby(level=0).agg(list)
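A minimal runnable sketch of the explode/map approach, using hypothetical miniature versions of the two dataframes built from the question's sample values:

```python
import pandas as pd

# Hypothetical stand-ins for df1 and df2 from the question.
df1 = pd.DataFrame({
    'uid': ['afea136c-217f-4b1d-857c-dc4075bxxxxx'],
    'members': [['08xx-b8xx- 4bcf-8axx-5f86xxxxxx', '64xx5c4..']],
    'type': ['group'],
})
df2 = pd.DataFrame({
    'uid': ['08xx-b8xx- 4bcf-8axx-5f86xxxxxx', '64xx5c4..'],
    'name': ['l_re-exx-xx-xx-19.172.211.0m23', 'l_re-exx-xx-xx-19.172.211.0m23'],
    'ipv4-address': ['19.172.211.0', '19.152.210.0'],
    'type': ['network', 'network'],
})

# explode() gives each list element its own row (the index repeats),
# map() looks each member up in df2 keyed on 'uid', and
# groupby(level=0) re-collects the matches back into one list per row.
df1['new'] = (
    df1['members']
    .explode()
    .map(df2.set_index('uid')['ipv4-address'])
    .groupby(level=0)
    .agg(list)
)
print(df1['new'].iloc[0])  # ['19.172.211.0', '19.152.210.0']
```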

Related

Merge Dataframes using List of Columns (Pandas Vlookup)

I'd like to look up several columns from another dataframe, which I have in a list, to bring them over to my main dataframe, essentially doing a "v-lookup" of ~30 columns using ID as the key or lookup value for all columns.
However, for the columns that are the same between the two dataframes, I don't want to bring over the duplicate columns but have those values be filled in df1 from df2.
I've tried below:
df = pd.merge(df, df2[['ID', [look_up_cols]]],
              on='ID',
              how='left',
              #suffixes=(False, False)
              )
but it brings in the shared columns from df2 when I want df2's values filled into the same columns in df1.
I've also tried creating a dictionary with the column pairs from each df and running this for loop to look up each item in the dictionary (lookup_map) in the other df, using ID as the key:
for col in look_up_cols:
    df1[col] = df2['ID'].map(lookup_map)
but this just returns NaNs.
You should be able to do something like the following:
df = pd.merge(df, df2[look_up_cols + ['ID']],
              on='ID',
              how='left')
This just adds the ID column to the look_up_cols list, which allows it to be used as the key in the merge function.
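A runnable sketch of the fixed merge on hypothetical data, where 'shared' exists in both frames and 'extra' only in df2. Since a shared column still comes back with _x/_y suffixes, one way to fold df2's values into df1's column afterwards is a fillna across the pair (this last step is an assumption, not part of the answer above):

```python
import pandas as pd

# Hypothetical frames: 'shared' exists in both, 'extra' only in df2.
df = pd.DataFrame({'ID': [1, 2, 3], 'shared': [None, 'b', None]})
df2 = pd.DataFrame({'ID': [1, 3], 'shared': ['a', 'c'], 'extra': ['x', 'y']})
look_up_cols = ['shared', 'extra']

# Appending 'ID' to the column list lets it act as the merge key.
merged = pd.merge(df, df2[look_up_cols + ['ID']], on='ID', how='left')

# Overlapping columns come back suffixed as shared_x / shared_y;
# prefer df1's value and fall back to df2's where df1 is missing.
merged['shared'] = merged['shared_x'].fillna(merged['shared_y'])
merged = merged.drop(columns=['shared_x', 'shared_y'])
print(merged['shared'].tolist())  # ['a', 'b', 'c']
```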

Averaging data of dataframe columns based on redundancy of another column

I want to average the data of one column in a pandas dataframe if the rows share the same 'id', which is stored in another column of the same dataframe. To make it simple I have:
and I want:
where it is clear that the elements of the 'nx' and 'ny' columns have been averaged whenever their 'nodes' value was the same. The 'maille' column, on the other hand, has to remain untouched.
I'm trying with groupby but so far couldn't manage to keep the 'maille' column as it is.
Any idea?
Use GroupBy.transform with the column names to aggregate specified in a list, and assign back:
cols = ['nx', 'ny']
df[cols] = df.groupby('nodes')[cols].transform('mean')
print(df)
Another idea with DataFrame.update:
df.update(df.groupby('nodes')[cols].transform('mean'))
print(df)
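A self-contained sketch of the transform approach on hypothetical data with a duplicated 'nodes' id. Because transform returns a result aligned to the original index, only the listed columns are overwritten and 'maille' is left untouched:

```python
import pandas as pd

# Hypothetical data: nodes 1 appears twice, so its nx/ny get averaged.
df = pd.DataFrame({
    'nodes':  [1, 1, 2],
    'nx':     [0.0, 2.0, 5.0],
    'ny':     [1.0, 3.0, 7.0],
    'maille': ['a', 'b', 'c'],
})

# transform('mean') broadcasts each group's mean back to every row
# of that group, keeping the original index and row count.
cols = ['nx', 'ny']
df[cols] = df.groupby('nodes')[cols].transform('mean')
print(df['nx'].tolist())      # [1.0, 1.0, 5.0]
print(df['maille'].tolist())  # ['a', 'b', 'c']  (untouched)
```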

Expand column values from a grouped-by DataFrame into proper columns

After a GroupBy operation I have the following DataFrame:
The user_id is grouped with its respective aisle_id as I want. Now, I want to turn the aisle_id values into columns, with user_id as the index, all the aisle_id values as columns, and as values the number of times each user_id and aisle_id matched in the previous dataset. For example, if user_id 1 has bought from aisle_id 12 on 3 occasions, the value at DF[1, 12] would be 3.
With Pandas pivot tables I can get the template of the user_id as index, and the aisle_id as columns, but I can't seem to find the way to create the values specified above.
Considering your first dataframe is df, I think you could try this:
df2 = pd.DataFrame(index=df['user_id'].unique(), columns=df['aisle_id'].unique())
for i in df2.index:
    for j in df2.columns:
        df2.at[i, j] = len(df.loc[(df['user_id'] == i) & (df['aisle_id'] == j)])
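A runnable version of the loop on hypothetical purchase records; for larger frames, pd.crosstab computes the same count table vectorized (mentioning crosstab is an addition, not part of the answer above):

```python
import pandas as pd

# Hypothetical purchases: user 1 bought from aisle 12 three times.
df = pd.DataFrame({
    'user_id':  [1, 1, 1, 2, 2],
    'aisle_id': [12, 12, 12, 12, 7],
})

# Count matches cell by cell with nested loops, as in the answer.
df2 = pd.DataFrame(index=df['user_id'].unique(), columns=df['aisle_id'].unique())
for i in df2.index:
    for j in df2.columns:
        df2.at[i, j] = len(df.loc[(df['user_id'] == i) & (df['aisle_id'] == j)])
print(df2.at[1, 12])  # 3

# Equivalent one-liner: crosstab tallies co-occurrences directly.
counts = pd.crosstab(df['user_id'], df['aisle_id'])
print(counts.at[1, 12])  # 3
```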

How can I add new rows from a dataframe to another one based on key column

My df1 is something like the first table in the image below, with the key column being Name. I want to add new rows from another dataframe, df2, which has only Name, Year, and Value columns. The new rows should be added based on Name; the other columns would just repeat the same value per Name. The result should look like the second table in the image below. How can I do this in pandas?
Create a sub-table df3 of df1 consisting of the Group, Name, and Other columns, keeping only distinct records. Then left-join df2 with df3 to get the desired result.
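The steps above can be sketched as follows, on hypothetical frames matching the question's description (column names Group/Name/Other/Year/Value are taken from the question; the sample values are assumptions):

```python
import pandas as pd

# Hypothetical existing data (df1) and new rows to add (df2).
df1 = pd.DataFrame({
    'Group': ['G1', 'G1'],
    'Name':  ['A', 'B'],
    'Other': ['o1', 'o2'],
    'Year':  [2020, 2020],
    'Value': [10, 20],
})
df2 = pd.DataFrame({'Name': ['A', 'B'], 'Year': [2021, 2021], 'Value': [11, 21]})

# One distinct record of the repeating columns per Name...
df3 = df1[['Group', 'Name', 'Other']].drop_duplicates()

# ...left-joined onto df2 so each new row picks up its repeated values,
# then appended to df1.
new_rows = df2.merge(df3, on='Name', how='left')
result = pd.concat([df1, new_rows], ignore_index=True)
print(len(result))  # 4
```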

fill blank values of one dataframe with the values of another dataframe based on conditions-pandas

I have the above dataframe df, and I have the following dataframe as df2.
I want to fill the missing values in df with the values in df2 corresponding to the id.
Also, for Player1, Player2, and Player3: if a value is missing, I want to replace the Player1, Player2, Player3 values of df with the corresponding values from df2.
Thus the resultant dataframe would look like this.
Notice that Rick, Scott, and Andrew are still forwards, as they are in df. I just replaced the missing players in df with the corresponding players from df2.
So far, I have attempted to fill the blank values in df with the values in df2:
df = pd.read_csv('file1.csv')
for s in list(df.columns[1:]):
    df[s] = df[s].str.strip()
df.fillna('', inplace=True)
df.replace(r'', np.nan, regex=True, inplace=True)
df2 = pd.read_csv('file2.csv')
for s in list(df2.columns[1:]):
    df2[s] = df2[s].str.strip()
df.set_index('Team ID', inplace=True)
df2.set_index('Team ID', inplace=True)
df.fillna(df2, inplace=True)
df.reset_index(inplace=True)
I am getting the above result. How can I get the result in image number 3?
Using combine_first:
df1 = df1.set_index('Team ID')
df2 = df2.set_index('Team ID')
df2 = df2.combine_first(df1.replace('', np.nan)).reset_index()
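A minimal sketch of this combine_first pattern on hypothetical rosters keyed on 'Team ID' (the player names are made up): combine_first keeps df2's values where present and falls back to df1, and replacing '' with NaN first makes df1's blanks count as missing:

```python
import numpy as np
import pandas as pd

# Hypothetical rosters; '' marks a blank entry in df1,
# NaN marks a player df2 has no replacement for.
df1 = pd.DataFrame({'Team ID': [1, 2],
                    'Player1': ['', 'Rick'],
                    'Player2': ['Old', '']})
df2 = pd.DataFrame({'Team ID': [1, 2],
                    'Player1': ['New1', np.nan],
                    'Player2': ['New2', np.nan]})

df1 = df1.set_index('Team ID')
df2 = df2.set_index('Team ID')

# df2's non-missing values win; df1 (blanks converted to NaN) fills the rest.
result = df2.combine_first(df1.replace('', np.nan)).reset_index()
print(result)
```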
