This question already has answers here:
Column name corresponding to largest value in pandas DataFrame [duplicate]
(2 answers)
Most efficient way to return Column name in a pandas df
(1 answer)
Closed 4 years ago.
I am trying to make the following transformation in Python,
from
A B C D E
x 1 0 0 0
y 0 1 1 0
z 1 0 0 1
to
x B
y C,D
z B,E
Do you have any ideas?
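Try dot: set A as the index, multiply the 0/1 values by the column names (each followed by a comma), and strip the trailing comma.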
df=df.set_index('A')
df.dot(df.columns+',').str[:-1]
A
x B
y C,D
z B,E
dtype: object
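For readers who find the dot trick opaque, here is a minimal sketch of a more explicit equivalent that swaps dot for a row-wise apply. It assumes the frame from the question, with integer 0/1 flags in columns B to E; the name labels is only illustrative.

```python
import pandas as pd

# Sample frame from the question: A holds the row labels, B..E hold 0/1 flags.
df = pd.DataFrame({
    "A": ["x", "y", "z"],
    "B": [1, 0, 1],
    "C": [0, 1, 0],
    "D": [0, 1, 0],
    "E": [0, 0, 1],
}).set_index("A")

# For each row, keep the column names whose flag is 1 and join them with commas.
labels = df.apply(lambda row: ",".join(row[row == 1].index), axis=1)
print(labels)
```

The apply version is likely slower on large frames than dot, but it makes the intent easier to read.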
This question already has answers here:
How do I melt a pandas dataframe?
(3 answers)
Closed 2 months ago.
I have a sample dataframe:
s_id c1_id c2_id c3_id
1 a b c
2 a b
3 x y z
How can I reshape the dataframe to
s_id c_id
1 a
1 b
1 c
2 a
2 b
3 x
3 y
3 z
Here you go
df.set_index("s_id").stack().droplevel(1)
Result:
s_id
1 a
1 b
1 c
2 a
2 b
3 x
3 y
3 z
dtype: object
Explanation:
Set s_id as the index.
Apply stack so that the remaining columns are stacked on top of each other into a single column.
Drop the index level that holds the original column names, because we don't need it.
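The steps above can be sketched end to end as follows. This is a minimal reproduction, assuming the empty cell in row 2 is a missing value (None here); the explicit dropna is an added precaution, since whether stack drops missing values by default may depend on the pandas version. The name long_form is illustrative.

```python
import pandas as pd

# Sample frame from the question; the blank c3_id cell is assumed to be missing.
df = pd.DataFrame({
    "s_id": [1, 2, 3],
    "c1_id": ["a", "a", "x"],
    "c2_id": ["b", "b", "y"],
    "c3_id": ["c", None, "z"],
})

long_form = (
    df.set_index("s_id")
      .stack()        # one entry per (s_id, original column) pair
      .dropna()       # assumption: drop the missing cell explicitly
      .droplevel(1)   # discard the level holding c1_id / c2_id / c3_id
)
print(long_form)
```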
This question already has answers here:
Pandas Merging 101
(8 answers)
Closed 3 years ago.
I have two data frames. For the sake of simplicity, I will provide two dummy data frames here.
A = pd.DataFrame({'id':[1,2,3], 'name':['a','b','c']})
B = pd.DataFrame({'id':[1,1,1,3,2,3,1]})
Now, I want to create a column on the data frame B with the names that match the ids.
In this case, my desired output will be:
B = pd.DataFrame({'id':[1,1,1,3,2,3,1], 'name':['a','a','a','c','b','c','a']})
I was trying to use .apply and lambda or try to come up with other ideas, but I could not make it work.
Thank you for your help.
Use pd.merge or .map. Both use your id column as the key and return the matching values for your target dataframe.
df = pd.merge(B,A,on='id',how='left')
#or
B['name'] = B['id'].map(A.set_index('id')['name'])
print(df)
id name
0 1 a
1 1 a
2 1 a
3 3 c
4 2 b
5 3 c
6 1 a
print(B)
id name
0 1 a
1 1 a
2 1 a
3 3 c
4 2 b
5 3 c
6 1 a
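A self-contained sketch of both options, using the dummy frames from the question. The names merged, lookup, and mapped are illustrative and not part of the original answer.

```python
import pandas as pd

A = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
B = pd.DataFrame({"id": [1, 1, 1, 3, 2, 3, 1]})

# Option 1: a left merge keeps every row of B, in B's order.
merged = pd.merge(B, A, on="id", how="left")

# Option 2: build an id -> name lookup Series and map it onto B's ids.
lookup = A.set_index("id")["name"]
mapped = B.assign(name=B["id"].map(lookup))

print(merged)
print(mapped)
```

With either option, an id in B that has no match in A ends up as a missing value in name. The map variant assumes the ids in A are unique.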
This question already has answers here:
How can I pivot a dataframe?
(5 answers)
Closed 3 years ago.
I have a dataframe that looks like this:
FIRST SECOND
1 a
1 b
1 c
1 b
2 a
2 k
3 r
3 r
3 r
I need to get a matrix like this, which shows how many times each word occurs for every number:
FIRST a b c k r
1 1 2 1 0 0
2 1 0 0 1 0
3 0 0 0 0 3
Can anyone help me with this? :)
This works:
pd.concat([df.FIRST, pd.get_dummies(df.SECOND)], axis=1).groupby('FIRST').sum()
Use pivot_table with aggfunc='count'
pd.pivot_table(df, values = 'SECOND',
columns = df['SECOND'],
index = df['FIRST'],
aggfunc ='count',
fill_value = 0)
Outputs
SECOND a b c k r
FIRST
1 1 2 1 0 0
2 1 0 0 1 0
3 0 0 0 0 3
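As a point of comparison, pd.crosstab is another way to get the same count matrix. It is not used in the answers above; this is a minimal sketch on the sample data from the question, with counts as an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    "FIRST": [1, 1, 1, 1, 2, 2, 3, 3, 3],
    "SECOND": ["a", "b", "c", "b", "a", "k", "r", "r", "r"],
})

# Count how often each SECOND value occurs for every FIRST value;
# combinations that never occur are filled with 0.
counts = pd.crosstab(df["FIRST"], df["SECOND"])
print(counts)
```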
This question already has answers here:
How to widen a dataframe - pandas
(2 answers)
Closed 4 years ago.
I would like to split the column named domain into domain_num and domain_alp.
domain count year
1 [1, A] 0 1972.0
2 [1, B] 0 1972.0
3 [1, C] 0 1972.0
I also want to keep all the other columns. Here is a sample of my expected result:
domain_num domain_alp count year
1 1 A 0 1972.0
2 1 B 0 1972.0
3 1 C 0 1972.0
But when I tried this code to split the column:
new_df = pd.DataFrame(df['domain'].values.tolist(), columns=['domain_num','domain_alp'])
new_df
the result looks like this. It does not show all the columns as I expected.
domain_num domain_alp
1 A
1 B
1 C
Are there any suggestions for doing this?
Use concat to join the result back to the remaining columns:
new_df = pd.DataFrame(df['domain'].values.tolist(), columns=['domain_num','domain_alp'],index=df.index)
new_df=pd.concat([new_df ,df[['count','year']]],axis=1)
Another way is to assign the columns to your original df:
df[['domain_num', 'domain_alp']] = pd.DataFrame(df.domain.tolist(), index=df.index)
df.drop('domain', axis=1, inplace=True)
>>> df
count year domain_num domain_alp
0 0 1972.0 1 A
1 0 1972.0 1 B
2 0 1972.0 1 C
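A runnable sketch of the concat approach, assuming each domain cell is a two-element Python list and the frame has the 1-based index shown in the question. Passing index=df.index matters here: without it the new frame gets a default 0-based index and the rows would not line up. The names parts and result are illustrative.

```python
import pandas as pd

# Sample frame from the question, with a non-default index (1, 2, 3).
df = pd.DataFrame(
    {
        "domain": [[1, "A"], [1, "B"], [1, "C"]],
        "count": [0, 0, 0],
        "year": [1972.0, 1972.0, 1972.0],
    },
    index=[1, 2, 3],
)

# Expand each two-element list into two columns, reusing the original index.
parts = pd.DataFrame(
    df["domain"].tolist(),
    columns=["domain_num", "domain_alp"],
    index=df.index,
)

# Put the new columns in front of everything except the original domain column.
result = pd.concat([parts, df.drop(columns="domain")], axis=1)
print(result)
```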
This question already has answers here:
How do I select rows from a DataFrame based on column values?
(16 answers)
Closed 5 years ago.
I have this dataframe:
A B C
0 m 1 b
1 n 4 a
2 p 3 c
3 o 4 d
4 k 6 e
How can I get the rows where column A is n, p, or k, as follows:
A B C
0 n 4 a
1 p 3 c
2 k 6 e
Thanks.
Use .loc with isin:
df = df.loc[df.A.isin(['n','p','k']),:]
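A self-contained sketch of the same filter on the sample data. The reset_index call is an addition, assumed here because the expected output in the question is renumbered from 0; the name selected is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["m", "n", "p", "o", "k"],
    "B": [1, 4, 3, 4, 6],
    "C": ["b", "a", "c", "d", "e"],
})

# Keep the rows whose A value is in the wanted set, then renumber the index.
selected = df.loc[df["A"].isin(["n", "p", "k"])].reset_index(drop=True)
print(selected)
```

The rows come back in their original order in df, not in the order of the list passed to isin.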