convert row data to column data based on a condition - pandas [duplicate] - python

This question already has answers here:
How do I melt a pandas dataframe?
(3 answers)
Closed 2 months ago.
I have a sample dataframe:
s_id c1_id c2_id c3_id
1 a b c
2 a b
3 x y z
How can I reshape the dataframe to the following?
s_id c_id
1 a
1 b
1 c
2 a
2 b
3 x
3 y
3 z

Here you go
df.set_index("s_id").stack().droplevel(1)
Result:
s_id
1 a
1 b
1 c
2 a
2 b
3 x
3 y
3 z
dtype: object
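Explanation:
Set s_id as the index.
Apply stack so that the values of all columns are stacked into a single Series, indexed by s_id and the original column name.
Drop the index level that holds the original column names (droplevel(1)), because we don't need them.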
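For reference, here is a minimal, self-contained sketch of the steps above. The sample frame is rebuilt by hand, and I am assuming the empty cell in row 2 is a real missing value (None/NaN) rather than an empty string. The explicit dropna() and the final rename/reset_index are my additions: dropna() keeps the result the same on pandas versions where stack does not drop missing values by itself, and the last two calls give the two-column s_id / c_id layout asked for.

```python
import pandas as pd

# Sample data rebuilt from the question; the blank cell is assumed to be missing (None).
df = pd.DataFrame({
    "s_id": [1, 2, 3],
    "c1_id": ["a", "a", "x"],
    "c2_id": ["b", "b", "y"],
    "c3_id": ["c", None, "z"],
})

long_df = (
    df.set_index("s_id")   # s_id becomes the row label
      .stack()             # one value per (s_id, original column) pair
      .dropna()            # make sure the missing cell is gone on every pandas version
      .droplevel(1)        # discard the original column names
      .rename("c_id")      # name the resulting Series
      .reset_index()       # back to a two-column dataframe: s_id, c_id
)
print(long_df)
```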

Related

What is the most efficient way to populate one pandas dataframe using another dataframe? [duplicate]

This question already has answers here:
Pandas Merging 101
(8 answers)
Closed 2 years ago.
I am wondering how I can do the following operation most efficiently, so that it also scales to dataframes with a million or more rows.
I have two pandas dataframes:
Data1:
Position Letter
1 a
2 b
3 c
4 b
5 a
Data2:
Weight Letter
1 a
2 b
3 c
Now I want to create an extra column (Weight) in Data1, resulting in the following:
Position Letter Weight
1 a 1
2 b 2
3 c 3
4 b 2
5 a 1
A straightforward way is to use merge, which joins the two dataframes on the shared Letter column:
df = df1.merge(df2, on=['Letter'])
print(df)
Position Letter Weight
0 1 a 1
1 5 a 1
2 2 b 2
3 4 b 2
4 3 c 3
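Note that in the output above the rows come back grouped by Letter rather than in the original Position order. If the order of Data1 matters, one option is a left merge, which keeps the rows in the order of the left dataframe. A minimal sketch, with df1 and df2 rebuilt from the question's tables:

```python
import pandas as pd

# df1 / df2 rebuilt by hand from the Data1 / Data2 tables in the question.
df1 = pd.DataFrame({"Position": [1, 2, 3, 4, 5], "Letter": ["a", "b", "c", "b", "a"]})
df2 = pd.DataFrame({"Weight": [1, 2, 3], "Letter": ["a", "b", "c"]})

# how="left" keeps every row of df1, in df1's order, and looks up Weight by Letter.
merged = df1.merge(df2, on="Letter", how="left")
print(merged)
```

With how="left", a Letter that has no entry in df2 would get NaN in Weight instead of being dropped.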

python pandas - creating a column after matching keys with another data frame [duplicate]

This question already has answers here:
Pandas Merging 101
(8 answers)
Closed 3 years ago.
I have two data frames. For the sake of simplicity, I will provide two dummy data frames here.
A = pd.DataFrame({'id':[1,2,3], 'name':['a','b','c']})
B = pd.DataFrame({'id':[1,1,1,3,2,3,1]})
Now, I want to create a column on the data frame B with the names that match the ids.
In this case, my desired output will be:
B = pd.DataFrame({'id':[1,1,1,3,2,3,1], 'name':['a','a','a','c','b','c','a']})
I was trying to use .apply and lambda or try to come up with other ideas, but I could not make it work.
Thank you for your help.
Use pd.merge or .map: both use your id column as the key and return the matching name for every row of your target dataframe.
df = pd.merge(B,A,on='id',how='left')
#or
B['name'] = B['id'].map(A.set_index('id')['name'])
print(df)
id name
0 1 a
1 1 a
2 1 a
3 3 c
4 2 b
5 3 c
6 1 a
print(B)
id name
0 1 a
1 1 a
2 1 a
3 3 c
4 2 b
5 3 c
6 1 a
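One thing worth knowing about the .map variant is what happens when an id has no match. The sketch below uses a slightly different B than the question (the id 4 is my addition, purely for illustration) and assumes id is unique in A, since the lookup Series built with set_index needs unique index values to act as a mapping.

```python
import pandas as pd

A = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
# Hypothetical B for illustration: id 4 does not exist in A.
B = pd.DataFrame({"id": [1, 3, 4]})

# Build an id -> name lookup Series, then map each id in B through it.
lookup = A.set_index("id")["name"]
B["name"] = B["id"].map(lookup)
print(B)
```

The unmatched id ends up with NaN in name, which is the same result a left merge would give for that row.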

Separate String from and create a dataframe column [duplicate]

This question already has answers here:
Split cell into multiple rows in pandas dataframe
(5 answers)
How to unnest (explode) a column in a pandas DataFrame, into multiple rows
(16 answers)
Closed 3 years ago.
I am working on the problem below:
df_temp = pd.DataFrame()
df_temp.insert(0, 'Label', ["A|B|C","A|C","C|B","A","B"])
df_temp.insert(1, 'ID', [1,2,3,4,5])
df_temp
Label ID
0 A|B|C 1
1 A|C 2
2 C|B 3
3 A 4
4 B 5
I want to convert this dataframe into something like the dataframe below, with one row per label for each ID.
Expected Output :
ID Label
1 A
1 B
1 C
2 A
2 C
3 C
3 B
4 A
5 B
Try this:
(df_temp.set_index('ID')['Label']
.str.split('|', expand=True)
.reset_index()
.melt('ID')
.drop('variable', axis=1)
.dropna()
.sort_values('ID'))
Output:
ID value
0 1 A
5 1 B
10 1 C
1 2 A
6 2 C
2 3 C
7 3 B
3 4 A
4 5 B
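On pandas 0.25 or later, the explode method mentioned in the linked duplicate gives the same rows with fewer steps. This is an alternative sketch, not the method used in the answer above; it keeps the column name Label and the original row order.

```python
import pandas as pd

df_temp = pd.DataFrame({
    "Label": ["A|B|C", "A|C", "C|B", "A", "B"],
    "ID": [1, 2, 3, 4, 5],
})

out = (
    df_temp.assign(Label=df_temp["Label"].str.split("|"))  # each cell becomes a list of labels
           .explode("Label")                                # one row per list element
           [["ID", "Label"]]                                # column order from the expected output
           .reset_index(drop=True)
)
print(out)
```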

How to select all but the 3 last columns of a dataframe in Python [duplicate]

This question already has answers here:
Selecting last n columns and excluding last n columns in dataframe
(3 answers)
Closed 4 years ago.
I want to select all but the 3 last columns of my dataframe.
I tried :
df.loc[:,-3]
But it does not work
To select everything except the last 3 columns, use iloc:
In [1639]: df
Out[1639]:
a b c d e
0 1 3 2 2 2
1 2 4 1 1 1
In [1640]: df.iloc[:,:-3]
Out[1640]:
a b
0 1 3
1 2 4
Alternatively, slice df.columns and pass the result to df[...]:
print(df[df.columns[:-3]])
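The attempt in the question fails because loc selects by label, while iloc selects by integer position, and a negative slice only makes sense for positions. A small sketch comparing the two approaches above, using the same sample frame as the iloc answer:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [2, 1], "d": [2, 1], "e": [2, 1]})

# Position-based: all rows, every column except the last three.
by_position = df.iloc[:, :-3]

# Label-based: slice the column Index first, then select those labels.
by_label = df[df.columns[:-3]]

print(by_position.equals(by_label))
```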

Display column names based on 0, 1 in python [duplicate]

This question already has answers here:
Column name corresponding to largest value in pandas DataFrame [duplicate]
(2 answers)
Most efficient way to return Column name in a pandas df
(1 answer)
Closed 4 years ago.
I am trying to make the following transformation in Python
from
A B C D E
x 1 0 0 0
y 0 1 1 0
z 1 0 0 1
to
x B
y C,D
z B,E
Do you have any ideas ?
You can do this with dot, after setting A as the index:
df=df.set_index('A')
df.dot(df.columns+',').str[:-1]
A
x B
y C,D
z B,E
dtype: object
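Why this works: multiplying a string by 1 gives the string back, multiplying by 0 gives an empty string, and the dot product adds the pieces together, so each row ends up with the names of its 1-columns joined by commas. A self-contained sketch, with the frame rebuilt from the question's table (A holds the row labels):

```python
import pandas as pd

# Rebuilt from the question: A is the label column, B..E are 0/1 flags.
df = pd.DataFrame({
    "A": ["x", "y", "z"],
    "B": [1, 0, 1],
    "C": [0, 1, 0],
    "D": [0, 1, 0],
    "E": [0, 0, 1],
}).set_index("A")

# The building blocks of the trick: int * str repeats the string.
assert 1 * "B," == "B," and 0 * "B," == ""

# Row-wise sum of flag * "name," yields e.g. "C,D,"; .str[:-1] trims the trailing comma.
labels = df.dot(df.columns + ",").str[:-1]
print(labels)
```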
