Display column names based on 0, 1 in python [duplicate] - python

This question already has answers here:
Column name corresponding to largest value in pandas DataFrame [duplicate]
(2 answers)
Most efficient way to return Column name in a pandas df
(1 answer)
Closed 4 years ago.
I am trying to make the following transformation in python
from
A B C D E
x 1 0 0 0
y 0 1 1 0
z 1 0 0 1
to
x B
y C,D
z B,E
Do you have any ideas?

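Use dot: after setting A as the index, multiply the 0/1 values by the column names (each with a trailing comma) and strip the final comma.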
df=df.set_index('A')
df.dot(df.columns+',').str[:-1]
A
x B
y C,D
z B,E
dtype: object
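The dot trick relies on the fact that multiplying a string by 0 or 1 gives either an empty string or the string itself. For readers who prefer something more explicit, here is a minimal sketch of the same transformation using a row-wise join. The frame is rebuilt from the question's sample, and the name labels is only illustrative.

```python
import pandas as pd

# Sample data from the question, with A as the index.
df = pd.DataFrame(
    {
        "A": ["x", "y", "z"],
        "B": [1, 0, 1],
        "C": [0, 1, 0],
        "D": [0, 1, 0],
        "E": [0, 0, 1],
    }
).set_index("A")

# For each row, keep the column names whose value is 1 and join them.
labels = df.apply(lambda row: ",".join(row[row == 1].index), axis=1)
print(labels)
```

This is likely slower than dot on large frames, but it does not depend on string multiplication and needs no trailing-comma cleanup.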

Related

convert row data to column data based on a condition - pandas [duplicate]

This question already has answers here:
How do I melt a pandas dataframe?
(3 answers)
Closed 2 months ago.
I've a sample dataframe
s_id c1_id c2_id c3_id
1 a b c
2 a b
3 x y z
how can I transpose the dataframe to
s_id c_id
1 a
1 b
1 c
2 a
2 b
3 x
3 y
3 z
Here you go
df.set_index("s_id").stack().droplevel(1)
Result:
s_id
1 a
1 b
1 c
2 a
2 b
3 x
3 y
3 z
dtype: object
Explanation:
Set s_id as the index.
Apply stack so that the columns are stacked on top of each other into a single Series.
Drop the index level that holds the original column names, because we don't need it.
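The three steps can be sketched end to end as follows. This assumes the blank cell in the sample is a missing value (NaN) rather than an empty string; the explicit dropna() is added so the result does not depend on whether the installed pandas version drops missing values inside stack. The names stacked and long_df are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question; the blank cell is assumed to be NaN.
df = pd.DataFrame(
    {
        "s_id": [1, 2, 3],
        "c1_id": ["a", "a", "x"],
        "c2_id": ["b", "b", "y"],
        "c3_id": ["c", np.nan, "z"],
    }
)

# Step 1 and 2: index by s_id, then stack the columns into one Series
# whose index has two levels (s_id, original column name).
stacked = df.set_index("s_id").stack()

# Step 3: drop missing values and the column-name level, then turn the
# result back into a two-column frame.
long_df = stacked.dropna().droplevel(1).rename("c_id").reset_index()
print(long_df)
```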

python pandas - creating a column after matching keys with another data frame [duplicate]

This question already has answers here:
Pandas Merging 101
(8 answers)
Closed 3 years ago.
I have two data frames. For the sake of simplicity, I will provide two dummy data frames here.
A = pd.DataFrame({'id':[1,2,3], 'name':['a','b','c']})
B = pd.DataFrame({'id':[1,1,1,3,2,3,1]})
Now, I want to create a column on the data frame B with the names that match the ids.
In this case, my desired output would be:
B = pd.DataFrame({'id':[1,1,1,3,2,3,1], 'name':['a','a','a','c','b','c','a']})
I tried using .apply with a lambda and a few other ideas, but I could not make it work.
Thank you for your help.
Use pd.merge or .map. Both use the id column as the key and return the matching name for each row of the target dataframe.
df = pd.merge(B,A,on='id',how='left')
#or
B['name'] = B['id'].map(A.set_index('id')['name'])
print(df)
id name
0 1 a
1 1 a
2 1 a
3 3 c
4 2 b
5 3 c
6 1 a
print(B)
id name
0 1 a
1 1 a
2 1 a
3 3 c
4 2 b
5 3 c
6 1 a
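As a self-contained check that the two approaches agree, here is a small sketch using the question's dummy frames. The names merged and mapped are illustrative. Note that merge returns a new frame, while map produces a Series that can be assigned back to B, and that map expects the ids in A to be unique.

```python
import pandas as pd

A = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
B = pd.DataFrame({"id": [1, 1, 1, 3, 2, 3, 1]})

# Left merge: keeps every row of B, in B's order, and adds A's columns.
merged = pd.merge(B, A, on="id", how="left")

# Map: look each id up in a Series indexed by id.
mapped = B["id"].map(A.set_index("id")["name"])

# Both approaches should give the same names in the same order.
print(merged["name"].tolist() == mapped.tolist())
```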

Calculate number of value repetition inside column and move calculation into matrix PANDAS [duplicate]

This question already has answers here:
How can I pivot a dataframe?
(5 answers)
Closed 3 years ago.
I have a dataframe that looks like this:
FIRST SECOND
1 a
1 b
1 c
1 b
2 a
2 k
3 r
3 r
3 r
I need to get a matrix like this, which gives the number of occurrences of each word for every number:
FIRST a b c k r
1 1 2 1 0 0
2 1 0 0 1 0
3 0 0 0 0 3
Can anyone help me with this? :)
This works:
pd.concat([df.FIRST, pd.get_dummies(df.SECOND)], axis=1).groupby('FIRST').sum()
Use pivot_table with aggfunc='count'
pd.pivot_table(df, values='SECOND',
               columns=df['SECOND'],
               index=df['FIRST'],
               aggfunc='count',
               fill_value=0)
Outputs
SECOND a b c k r
FIRST
1 1 2 1 0 0
2 1 0 0 1 0
3 0 0 0 0 3
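Another option, not mentioned in the answers above, is pd.crosstab, which counts co-occurrences of two columns directly. A minimal sketch on the question's data (the name counts is illustrative):

```python
import pandas as pd

df = pd.DataFrame(
    {
        "FIRST": [1, 1, 1, 1, 2, 2, 3, 3, 3],
        "SECOND": ["a", "b", "c", "b", "a", "k", "r", "r", "r"],
    }
)

# Rows are the values of FIRST, columns are the values of SECOND,
# and each cell is the number of times the pair occurs.
counts = pd.crosstab(df["FIRST"], df["SECOND"])
print(counts)
```

Combinations that never occur are filled with 0, so no fill_value is needed here.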

how to split column and show all the columns part of data in pandas? [duplicate]

This question already has answers here:
How to widen a dataframe - pandas
(2 answers)
Closed 4 years ago.
I would like to split the column named domain into domain_num and domain_alp.
domain count year
1 [1, A] 0 1972.0
2 [1, B] 0 1972.0
3 [1, C] 0 1972.0
and show all the columns like this. Here is a sample of my expected result:
domain_num domain_alp count year
1 1 A 0 1972.0
2 1 B 0 1972.0
3 1 C 0 1972.0
But when I tried this code to split the column:
new_df = pd.DataFrame(df['domain'].values.tolist(), columns=['domain_num','domain_alp'])
new_df
the result looks like this. It does not show all the columns as I expected.
domain_num domain_alp
1 A
1 B
1 C
Are there any suggestions for doing this?
Use concat to join the result back together:
new_df = pd.DataFrame(df['domain'].values.tolist(), columns=['domain_num','domain_alp'],index=df.index)
new_df=pd.concat([new_df ,df[['count','year']]],axis=1)
Another way is to assign the columns to your original df:
df[['domain_num', 'domain_alp']] = pd.DataFrame(df.domain.tolist(), index=df.index)
df.drop('domain', axis=1, inplace=True)
>>> df
count year domain_num domain_alp
0 0 1972.0 1 A
1 0 1972.0 1 B
2 0 1972.0 1 C

How to take column of dataframe in pandas [duplicate]

This question already has answers here:
How do I select rows from a DataFrame based on column values?
(16 answers)
Closed 5 years ago.
I have this dataframe:
A B C
0 m 1 b
1 n 4 a
2 p 3 c
3 o 4 d
4 k 6 e
How can I get the rows where column A is n, p, or k, as follows:
A B C
0 n 4 a
1 p 3 c
2 k 6 e
thanks
Use .loc
df = df.loc[df.A.isin(['n','p','k']),:]
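The filter keeps the original row labels (1, 2, 4), whereas the expected output in the question is numbered 0 to 2. A minimal sketch that also renumbers the rows, assuming that is what is wanted (the name result is illustrative):

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": ["m", "n", "p", "o", "k"],
        "B": [1, 4, 3, 4, 6],
        "C": ["b", "a", "c", "d", "e"],
    }
)

# Keep rows whose A value is in the list, then renumber from 0.
result = df.loc[df["A"].isin(["n", "p", "k"])].reset_index(drop=True)
print(result)
```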
