I have a dataframe that looks like this:
df1:
The other one with values is like this:
df2:
I want to update the values in df1 with those from df2, and the desired result is:
I don't know if it matters, but df1 has more columns than I showed here.
I tried some solutions using unstack, join and melt, but couldn't make them work.
What is the best way to do this?
I'm trying to pivot my df from wide to long, and I am attempting to replicate R's dplyr::pivot_longer() function. I have tried pd.wide_to_long() and pd.melt() but have had no success in correctly formatting the df. I also attempted df.pivot(), with the same result.
Here is what a subset of the df (called df_wide) looks like: Rows are Store Numbers, Columns are Dates, Values are Total Sales
My current function looks like this:
df_wide.pivot(index = df_wide.index,
              columns = ["Store", "Date", "Value"], # Output Col Names
              values = df_wide.values)
My desired output is a df that looks like this:
Note - this question is distinct from merging, as it is looking at changing the structure of a single data frame
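The stack() function moves the column labels into the index, which gets you most of the way there. Then reset the index and rename the columns as needed (the names level_0 and level_1 apply when the index and the columns are unnamed):
df_wide.stack().reset_index().rename(columns={'level_0': 'Store', 'level_1': 'Date', 0: 'Value'})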
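As an alternative to stack(), the same reshaping can be sketched with melt(), which is closer in spirit to pivot_longer(). The sample store numbers, dates and sales figures below are made up for illustration, and the sketch assumes the store numbers live in the index, as described in the question:

```python
import pandas as pd

# Hypothetical wide frame: index = store numbers, columns = dates, values = total sales.
df_wide = pd.DataFrame(
    {"2021-01-01": [100, 200], "2021-01-02": [110, 210]},
    index=[1, 2],
)

df_long = (
    df_wide
    .rename_axis("Store")   # name the index so reset_index() yields a "Store" column
    .reset_index()
    .melt(id_vars="Store", var_name="Date", value_name="Value")
)

print(df_long)
```

melt() keeps the id_vars column and turns every remaining column into a (Date, Value) pair, so each store/date combination ends up on its own row.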
I have 2 dataframes. One has a bunch of columns including f_uuid. The other dataframe has 2 columns, f_uuid and i_uuid.
The first dataframe may contain some f_uuids that the second dataframe doesn't, and vice versa.
I want the first dataframe to have a new column i_uuid (from the second dataframe) populated with the appropriate values for the matching f_uuid in that first dataframe.
How would I achieve this?
df1 = pd.merge(df1,
               df2,
               on='f_uuid')
By default this is an inner join, so rows of df1 whose f_uuid does not appear in df2 are dropped. If you want to keep every f_uuid from df1 (including those not available in df2), run
df1 = pd.merge(df1,
               df2,
               on='f_uuid',
               how='left')
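A minimal sketch of the difference between the default merge and how='left'. The f_uuid and i_uuid values and the "other" column are invented for illustration:

```python
import pandas as pd

# Hypothetical data: "a" exists only in df1, "d" exists only in df2.
df1 = pd.DataFrame({"f_uuid": ["a", "b", "c"], "other": [1, 2, 3]})
df2 = pd.DataFrame({"f_uuid": ["b", "c", "d"], "i_uuid": ["x", "y", "z"]})

# Default (inner) merge: only f_uuids present in both frames survive.
inner = pd.merge(df1, df2, on="f_uuid")

# Left merge: every row of df1 is kept; i_uuid is NaN where df2 has no match.
left = pd.merge(df1, df2, on="f_uuid", how="left")

print(inner)
print(left)
```

If df2 can contain the same f_uuid more than once, either merge will repeat the matching df1 row once per match, so it may be worth de-duplicating df2 first.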
I think what you're looking for is a merge: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html?highlight=merge#pandas.DataFrame.merge
In your case, that would look like:
bunch_of_col_df.merge(other_df, on="f_uuid")
I have a scenario where I want to find non-matching rows between two dataframes. Both dataframes will have around 30 columns and an id column that uniquely identifies each record/row. So, I want to check whether a row in df1 is different from the corresponding row in df2. df1 is the updated dataframe and df2 is the previous version.
I have tried pd.concat([df1, df2]).drop_duplicates(keep=False), but it just combines both dataframes. Is there a way to do this? I would really appreciate the help.
The sample data looks like this for both dfs.
id user_id type status
There will be a total of 39 columns, some of which may contain NULL values.
Thanks.
P.S. df2 will always be a subset of df1.
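If df1 and df2 have the same shape and identical row and column labels, you can compare them cell by cell with this code.
df3 = pd.DataFrame(np.where(df1==df2,True,False), columns=df1.columns)
The output shows False for every cell whose values do not match. Note that NaN never compares equal to NaN, so cells that are null in both dataframes also show False.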
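When the two frames do not have the same shape (for example, when df2 is a subset of df1), one possible approach is a left merge on all shared columns with indicator=True. This is only a sketch under assumptions: the column names and values below are invented, and both frames are assumed to use the same dtypes for each column. Rows of df1 that have no identical counterpart in df2 are flagged as left_only:

```python
import numpy as np
import pandas as pd

# Hypothetical previous version (df2) and updated version (df1).
df2 = pd.DataFrame({
    "id": [1, 2, 3],
    "status": ["new", "open", "open"],
    "score": [1.0, 2.0, np.nan],
})
df1 = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "status": ["new", "closed", "open", "new"],
    "score": [1.0, 2.0, np.nan, 4.0],
})

# With no "on" argument, merge joins on every column the two frames share,
# so a row only matches if all of its values are identical.
merged = df1.merge(df2, how="left", indicator=True)

# "left_only" marks rows of df1 that are new or changed relative to df2.
changed = merged.loc[merged["_merge"] == "left_only"].drop(columns="_merge")

print(changed)
```

Unlike the == comparison, merge treats null keys as matching each other, so the row with id 3 (NaN in both versions) is not reported as changed.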
I have a problem.
I have run 3 queries against 3 different tables in a database where the data is similar, and stored the results in 3 different dataframes.
My question is: is there any way to make a new dataframe in which each column is itself a dataframe?
Like in this image:
https://imgur.com/pATNi80
Thank you!
I do not know exactly what you need, but you can try this:
pd.DataFrame([d["col_name"] for d in df])
where df is the dataframe as shown in the image and col_name is the name of the column that you want as a separate dataframe.
Thank you to jezrael for the answer.
df = pd.concat([df1, df2, df3], axis=1, keys=('df1','df2','df3'))
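To illustrate what the keys argument does, here is a small sketch with three made-up single-column frames. The column name "value" and the numbers are assumptions for the example:

```python
import pandas as pd

# Hypothetical query results with the same layout.
df1 = pd.DataFrame({"value": [1, 2]})
df2 = pd.DataFrame({"value": [3, 4]})
df3 = pd.DataFrame({"value": [5, 6]})

# keys adds an outer column level, one label per source frame.
df = pd.concat([df1, df2, df3], axis=1, keys=("df1", "df2", "df3"))

print(df.columns.tolist())

# Selecting an outer label returns that source frame's columns.
print(df["df2"])
```

The result has MultiIndex columns, so each top-level label behaves like a column that holds a whole dataframe.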
Does anyone know of an efficient way to create a new dataframe based off of two dataframes in Python/Pandas?
What I am trying to do is build df3 from the rows of df1 whose value does not appear in df2. I am working with student IDs, and if a student ID from df1 is in df2, I do not want to include that row in the new dataframe, df3.
So does anybody know an efficient way to do this? I have googled and looked on SO, but found nothing that works so far.
Assuming the column is called ID.
df3 = df1[~df1["ID"].isin(df2["ID"])].copy()
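A short sketch of the isin() filter in action. The student IDs and names are invented for illustration:

```python
import pandas as pd

# Hypothetical student data.
df1 = pd.DataFrame({"ID": [101, 102, 103, 104],
                    "name": ["Ann", "Bob", "Cy", "Di"]})
df2 = pd.DataFrame({"ID": [102, 104]})

# isin() builds a boolean mask of df1 rows whose ID appears anywhere in df2;
# ~ inverts it, so only the IDs missing from df2 are kept.
df3 = df1[~df1["ID"].isin(df2["ID"])].copy()

print(df3)
```

Because isin() checks membership rather than position, the two dataframes do not need the same length or row order.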
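If both dataframes have the same length and the same index, you can also compare the ID columns row by row:
print(df1.loc[df1['ID'] != df2['ID']])
and assign the result to a third dataframe. Note that this compares IDs position by position, so it only gives the intended result when the rows of the two dataframes are aligned.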