Pandas DataFrame pivot two columns [duplicate] - python

This question already has answers here:
How do I transpose dataframe in pandas without index?
(3 answers)
Closed 1 year ago.
I have the following DataFrame df
value type
one 1
two 2
three 3
which I want to reshape so that the output looks like this:
one two three
1 2 3
I used
df.pivot(columns="value", values="type")
which gave me this:
one two three
1 nan nan
nan 2 nan
nan nan 3
How can I get around the redundancies?

You don't need to pivot the data; you can set value as the index and transpose with .T:
df.set_index('value').T
Out[22]:
value one two three
type 1 2 3
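A minimal, self-contained sketch of the transpose approach, assuming the frame has a string column named value and an integer column named type as in the question. It also reproduces the sparse pivot result for comparison: without an explicit index, pivot keeps the original row labels, so each value ends up in its own row. The names wide and sparse are only illustrative.

```python
import pandas as pd

# Frame as described in the question (column names taken from the post).
df = pd.DataFrame({"value": ["one", "two", "three"], "type": [1, 2, 3]})

# pivot without an index argument keeps the row labels 0, 1, 2,
# so every value sits in a separate row and the rest is NaN.
sparse = df.pivot(columns="value", values="type")

# Setting "value" as the index and transposing yields a single row.
wide = df.set_index("value").T

print(sparse)
print(wide)
```

If the leftover axis label is unwanted, `wide.columns.name = None` should drop it.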

Related

Proper way to add a column from one table to another [duplicate]

This question already has answers here:
Pandas Merging 101
(8 answers)
Closed 2 years ago.
I have two data frames here
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'id':[1,2,3,2,5], 'grade':[3,5,3,2,1]})
df2 = pd.DataFrame({'id':[1,2,3], 'final':[6,4,2]})
Now I want to take the final column from df2 and add it to df1 based on the id column. Here is the desired output:
output = pd.DataFrame({'id':[1,2,3,2,5],'grade':[3,5,3,2,1], 'final':[6,4,2,4,np.nan]})
What approach can I try?
One way to do it is by using map
df1['final'] = df1['id'].map(df2.set_index('id')['final'])
#result
id grade final
0 1 3 6.0
1 2 5 4.0
2 3 3 2.0
3 2 2 4.0
4 5 1 NaN
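As an alternative in the spirit of the linked merging question, a left merge on id should give the same column. This is a sketch using the frames from the question; the names mapped and merged are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"id": [1, 2, 3, 2, 5], "grade": [3, 5, 3, 2, 1]})
df2 = pd.DataFrame({"id": [1, 2, 3], "final": [6, 4, 2]})

# Lookup approach from the answer: map each id to its final value.
mapped = df1["id"].map(df2.set_index("id")["final"])

# Left merge: keep every row of df1, fill final where the id matches.
merged = df1.merge(df2, on="id", how="left")

print(merged)
```

The map version requires the ids in df2 to be unique; the merge version would instead duplicate rows of df1 if df2 held repeated ids.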

How to create dataframe from 2 dataframe if value exist in both dataframe [duplicate]

This question already has answers here:
Pandas Merging 101
(8 answers)
How to filter Pandas dataframe using 'in' and 'not in' like in SQL
(11 answers)
Closed 2 years ago.
I have 2 Pandas DataFrames with one column (ID).
The first one looks like this:
ID
1
2
3
4
5
and the second one looks like this:
ID
3
4
5
6
7
I want to make a new DataFrame by combining those 2 DataFrames, keeping only the values that exist in both.
This is the result that I want:
ID
3
4
5
Can you show me how to do this in the most efficient way with pandas? Thank you.
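Two possible approaches, following the topics of the linked duplicates, are an inner merge and an isin filter. This is only a sketch; the frame names first and second are assumptions, since the post does not name them, and no claim is made here about which is faster.

```python
import pandas as pd

# Hypothetical names for the two single-column frames in the question.
first = pd.DataFrame({"ID": [1, 2, 3, 4, 5]})
second = pd.DataFrame({"ID": [3, 4, 5, 6, 7]})

# Inner merge keeps only IDs present in both frames.
common_merge = first.merge(second, on="ID", how="inner")

# Boolean filter: keep rows of first whose ID also appears in second.
common_isin = first[first["ID"].isin(second["ID"])].reset_index(drop=True)

print(common_merge)
```

With duplicated IDs the two differ: the merge multiplies matching rows, while the isin filter keeps each row of first at most once.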

Merge Disjoint Columns in Pandas [duplicate]

This question already has answers here:
How to remove nan value while combining two column in Panda Data frame?
(5 answers)
Closed 4 years ago.
I have a pretty simple Pandas question that deals with merging two series. I have two series in a dataframe together that are similar to this:
Column1 Column2
0 Abc NaN
1 NaN Abc
2 Abc NaN
3 NaN Abc
4 NaN Abc
The answer will probably end up being a really simple .merge() or .concat() command, but I'm trying to get a result like this:
Column1
0 Abc
1 Abc
2 Abc
3 Abc
4 Abc
The idea is that for each row, there is a string of data in either Column1 or Column2, but never both. I spent about 10 minutes looking for answers on StackOverflow as well as Google, but I couldn't find a similar question that cleanly applied to what I was trying to do.
I realize that a lot of this question just stems from my ignorance of the three functions that Pandas has to stick series and dataframes together. Any help is very much appreciated. Thank you!
You can just use pd.Series.fillna:
df['Column1'] = df['Column1'].fillna(df['Column2'])
Merge or concat are not appropriate here; they are used primarily for combining dataframes or series based on labels.
Use groupby with first
df.groupby(df.columns.str[:-1],axis=1).first()
Out[294]:
Column
0 Abc
1 Abc
2 Abc
3 Abc
4 Abc
Or:
ndf = pd.DataFrame({'Column1':df.fillna('').sum(1)})
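A runnable sketch of the fillna approach on the sample data, with combine_first shown as an equivalent alternative that is not mentioned in the answers above. Note that groupby with axis=1 is deprecated in recent pandas releases (2.1 and later), so the fillna form may be the safer choice there.

```python
import numpy as np
import pandas as pd

# Sample data from the question: each row has a value in exactly one column.
df = pd.DataFrame({
    "Column1": ["Abc", np.nan, "Abc", np.nan, np.nan],
    "Column2": [np.nan, "Abc", np.nan, "Abc", "Abc"],
})

# Fill the gaps in Column1 with the values from Column2 (aligned by index).
filled = df["Column1"].fillna(df["Column2"])

# combine_first does the same: take Column1, fall back to Column2.
combined = df["Column1"].combine_first(df["Column2"])

print(filled)
```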

Python: how to merge two dataframes with different size? [duplicate]

This question already has answers here:
Merge two dataframes by index
(7 answers)
pandas: merge (join) two data frames on multiple columns
(6 answers)
Pandas Merging 101
(8 answers)
Closed 4 years ago.
I have two dataframes, df and df1. The first one contains all the information about all the possible combination of a dataset while the second one is just a subset without the information.
df
x y distance
0 1 4
0 2 3
0 3 2
1 2 2
1 3 5
2 3 1
df1
x y
1 3
2 3
2 3
I would like to merge df and df1 in order to have the following:
df1
x y distance
1 3 5
2 3 1
2 3 1
You can use the merge command
df.merge(df1, left_on=['x','y'], right_on=['x','y'], how='right')
Here you're merging df on the left with df1 on the right, using the columns x and y as the merge keys and keeping only the rows that are present in the right dataframe.
You can read more about merging and joining dataframes here.
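The merge above can be checked against the sample data with a short sketch. Since both frames use the same key names, on=['x', 'y'] is a shorter spelling of the left_on/right_on pair; the names merged and merged_left are only illustrative.

```python
import pandas as pd

# Full table of pairs with distances, as in the question.
df = pd.DataFrame({
    "x": [0, 0, 0, 1, 1, 2],
    "y": [1, 2, 3, 2, 3, 3],
    "distance": [4, 3, 2, 2, 5, 1],
})

# Subset of pairs without distances (note the repeated pair).
df1 = pd.DataFrame({"x": [1, 2, 2], "y": [3, 3, 3]})

# Right merge: keep every row of df1 and attach the matching distance.
merged = df.merge(df1, on=["x", "y"], how="right")

# Same result expressed as a left merge starting from df1.
merged_left = df1.merge(df, on=["x", "y"], how="left")

print(merged)
```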

Create Range Column with duplicate values pandas [duplicate]

This question already has answers here:
Pandas DENSE RANK
(4 answers)
pandas group by and assign a group id then ungroup
(3 answers)
Closed 5 years ago.
I have a pandas dataframe with a column, call it range_id, that looks something like this:
range_id
1
1
2
2
5
5
5
8
8
10
10
...
I want to maintain the number buckets (rows that share a value should still share a value), but make the numbers ascend uniformly. So the new column would look like this:
range_id
1
1
2
2
3
3
3
4
4
5
5
...
I could write a lambda function that maps these in such a way to achieve this desired output, but I was wondering if pandas has any sort of built-in functionality to achieve this, as it has always surprised me before in what it is capable of doing. Thanks for the help!
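In line with the linked dense-rank question, one built-in option is Series.rank with method="dense"; pd.factorize with sort=True is another. The sketch below uses only the values shown in the question (the elided rows are left out), and the column names dense and codes are illustrative.

```python
import pandas as pd

# Only the values visible in the question; the "..." rows are not guessed.
df = pd.DataFrame({"range_id": [1, 1, 2, 2, 5, 5, 5, 8, 8, 10, 10]})

# Dense rank: equal values share a rank and ranks increase by 1 with no gaps.
df["dense"] = df["range_id"].rank(method="dense").astype(int)

# factorize with sort=True numbers the sorted unique values from 0, so add 1.
df["codes"] = pd.factorize(df["range_id"], sort=True)[0] + 1

print(df)
```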
