Mapping column values with index of List - python

I have the dataframe below:
Df1
Col1
3
5
6
7
9
and I have the list below:
Mapping_list= ["Sales","Pre-Sales","Marketing", "Digital-Banking", "Payments", "telecom", "Core-Banking","Infra", "Cards", "Commercial-Banking" ]
I want to map each column value to the list item at that index position, like below:
Col1 Values
3 Digital-Banking
5 telecom
6 Core-Banking
7 Infra
9 Commercial-Banking
I could have done this if, instead of a list, I needed to map against another dataframe's index, but with a list I am running into problems.

You can convert your list to a Series and map it, since map accepts a Series (the default integer index of the Series matches the list positions):
df['Values'] = df.Col1.map(pd.Series(Mapping_list))
Prints:
Col1 Values
0 3 Digital-Banking
1 5 telecom
2 6 Core-Banking
3 7 Infra
4 9 Commercial-Banking
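
A variation on the same idea, as a minimal sketch: Series.map also accepts a dict, so dict(enumerate(...)) works in place of pd.Series(...). The lowercase name mapping_list and the out-of-range value 12 are my own additions, included only to show what happens when a position has no matching list entry.

```python
import pandas as pd

mapping_list = ["Sales", "Pre-Sales", "Marketing", "Digital-Banking", "Payments",
                "telecom", "Core-Banking", "Infra", "Cards", "Commercial-Banking"]

# Col1 holds positions into mapping_list; 12 is deliberately out of range.
df = pd.DataFrame({"Col1": [3, 5, 12]})

# dict(enumerate(...)) builds {0: "Sales", 1: "Pre-Sales", ...},
# which Series.map accepts just like a Series.
df["Values"] = df["Col1"].map(dict(enumerate(mapping_list)))

# Positions with no matching entry are mapped to NaN instead of raising.
print(df)
```

If missing positions should raise or get a default label instead, that has to be handled separately (for example with fillna).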


Appending rows to existing pandas dataframe

I have a pandas dataframe df1
a b
0 1 2
1 3 4
I have another dataframe in the form of a dictionary
dictionary = {'2' : [5, 6], '3' : [7, 8]}
I want to append the dictionary values as rows to dataframe df1. I am using pandas.DataFrame.from_dict() to convert the dictionary into a dataframe. The constraint is that, when I do this, I cannot provide a value for the 'columns' argument of from_dict().
So, when I try to concatenate the two dataframes, pandas adds the contents of the new dataframe as new columns. I do not want that. The final output I want is in this format:
a b
0 1 2
1 3 4
2 5 6
3 7 8
Can someone tell me how to do this in the least painful way?
Use concat with help of pd.DataFrame.from_dict, setting the columns of df1 during the conversion:
out = pd.concat([df1,
                 pd.DataFrame.from_dict(dictionary, orient='index',
                                        columns=df1.columns)
                ])
Output:
a b
0 1 2
1 3 4
2 5 6
3 7 8
Another possible solution, which uses numpy.vstack (with numpy imported as np):
pd.DataFrame(np.vstack([df1.values, np.array(list(dictionary.values()))]),
             columns=df1.columns)
Output:
a b
0 1 2
1 3 4
2 5 6
3 7 8
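
For reference, here is a self-contained sketch of the concat approach using the data from the question. The names new_rows and out_reset are mine. One thing worth noting: the dictionary keys are strings, so a plain concat keeps '2' and '3' as string labels next to the integer labels 0 and 1; passing ignore_index=True is one way to get a clean 0..3 index, assuming the original labels do not matter.

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 3], "b": [2, 4]})
dictionary = {"2": [5, 6], "3": [7, 8]}

# orient='index' turns each dictionary key into a row label,
# and columns= names the columns so they line up with df1.
new_rows = pd.DataFrame.from_dict(dictionary, orient="index", columns=df1.columns)

# Plain concat keeps the labels as they are: 0, 1, '2', '3'.
out = pd.concat([df1, new_rows])

# ignore_index=True discards the labels and renumbers the rows 0..3.
out_reset = pd.concat([df1, new_rows], ignore_index=True)
print(out_reset)
```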

Pandas merge multiple value columns into a value and type column

I have a pandas dataframe where there are multiple integer value columns denoting a count. I want to transform this dataframe such that the value columns are merged into one column but another column is created denoting the column the value was taken from.
Input
a b c
0 2 5 8
1 3 6 9
2 4 7 10
Output
count type
0 2 a
1 3 a
2 4 a
3 5 b
4 6 b
5 7 b
6 8 c
7 9 c
8 10 c
I'm sure this is possible by looping over the entries and creating the rows for each original row one by one, but I'm also sure there is a pandas way to achieve this, and I would like to know what it is called.
The operation is called melting (wide to long). You could do it with the following:
pd.melt(df, value_vars=['a','b','c'], value_name='count', var_name='type')
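
A runnable sketch with the input from the question. By default melt puts the variable column first, so the last line only reorders the columns to match the requested output; the name long_df is my own.

```python
import pandas as pd

df = pd.DataFrame({"a": [2, 3, 4], "b": [5, 6, 7], "c": [8, 9, 10]})

# var_name labels the column holding the original column names,
# value_name labels the column holding the values.
long_df = pd.melt(df, value_vars=["a", "b", "c"],
                  var_name="type", value_name="count")

# Reorder so that 'count' comes before 'type', as in the question.
long_df = long_df[["count", "type"]]
print(long_df)
```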

Removing columns in Pandas

I am working on a big pandas dataframe and noticed that some columns have the same values in every row, but the columns' names are different.
Also, some values are text or timeseries data.
Is there an easy way to get rid of these duplicate columns and keep the first one each time?
Many thanks
Let's create a dummy data frame in which two columns with different names are duplicates.
import pandas as pd
df = pd.DataFrame({
    'col1': [1, 2, 3, 'b', 5, 6],
    'col2': [11, 'a', 13, 14, 15, 16],
    'col3': [1, 2, 3, 'b', 5, 6],
})
col1 col2 col3
0 1 11 1
1 2 a 2
2 3 13 3
3 b 14 b
4 5 15 5
5 6 16 6
To remove duplicate columns, first take the transpose, then apply drop_duplicates, and then transpose back:
df.T.drop_duplicates().T
Result:
col1 col2
0 1 11
1 2 a
2 3 13
3 b 14
4 5 15
5 6 16
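
One caveat with transposing twice is that a frame with mixed column types can come back with its dtypes changed. A possible variation, as a sketch: use the transpose only to find the duplicates, then select from the original frame so the dtypes stay as they were. The example data and the names keep and deduped are my own.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": [1, 2, 3],
    "col2": ["x", "y", "z"],
    "col3": [1, 2, 3],   # same values as col1 under a different name
})

# duplicated() on the transpose flags every column whose values
# repeat an earlier column; the first occurrence is not flagged.
keep = ~df.T.duplicated()

# Select the kept columns from the original frame, not from the transpose.
deduped = df.loc[:, keep]
print(deduped)
```

On a very wide or very long frame the transpose itself can be expensive, so this is worth timing on real data.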

Split/extract strings in Pandas series index and expand as DataFrame

I have a Pandas series as below:
index Value
'4-5-a' 2
'6-7-d' 3
'9-6-c' 7
'5-3-k' 8
I would like to extract/split the index of the series and form a DataFrame as shown below:
index Value x y
'4-5-a' 2 4 5
'6-7-d' 3 6 7
'9-6-c' 7 9 6
'5-3-k' 8 5 3
What is the best way to do this?
This is one way.
# convert series to dataframe, elevate index to column
df = s.to_frame('Value').reset_index()
# split by dash and exclude final split
df[['x', 'y']] = df['index'].str.split('-', expand=True).iloc[:, :-1].astype(int)
print(df)
index Value x y
0 4-5-a 2 4 5
1 6-7-d 3 6 7
2 9-6-c 7 9 6
3 5-3-k 8 5 3
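
An alternative sketch using str.extract with a regular expression instead of split. It assumes every index label has the form digits-digits-something, as in the question. The series is rebuilt here so the snippet runs on its own, and the name parts is mine.

```python
import pandas as pd

s = pd.Series([2, 3, 7, 8], index=["4-5-a", "6-7-d", "9-6-c", "5-3-k"])

# Turn the series into a frame and move the index into a column named 'index'.
df = s.to_frame("Value").reset_index()

# Each capture group becomes one column (0 and 1) of the extracted frame.
parts = df["index"].str.extract(r"^(\d+)-(\d+)-")

df["x"] = parts[0].astype(int)
df["y"] = parts[1].astype(int)
print(df)
```

Labels that do not match the pattern would give NaN in parts, and astype(int) would then fail, so malformed labels need handling first.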

Python pandas: Append rows of DataFrame and delete the appended rows

import pandas as pd
df = pd.DataFrame({
'id':[1,2,3,4,5,6,7,8,9,10,11],
'text': ['abc','zxc','qwe','asf','efe','ert','poi','wer','eer','poy','wqr']})
I have a DataFrame with columns:
id text
1 abc
2 zxc
3 qwe
4 asf
5 efe
6 ert
7 poi
8 wer
9 eer
10 poy
11 wqr
I have a list L = [1,3,6,10] which contains ids.
I am trying to merge the text column using this list, taking consecutive pairs of list elements (x, x+1). For the first pair, 1 and 3, I append to the text of id = 1 the text of the rows between them (here id 2) and then delete the row with id 2. For the next pair, 3 and 6, I append the text of ids 4 and 5 to id 3 and then delete the rows with id = 4 and 5, and so on for the remaining elements of the list.
My final output would look like this:
id text
1 abczxc # joining id 1 and 2
3 qweasfefe # joining id 3,4 and 5
6 ertpoiwereer # joining id 6,7,8,9
10 poywqr # joining id 10 and 11
You can use isin with where and ffill to build a grouping Series, which is then used for groupby with apply and the join function:
s = df.id.where(df.id.isin(L)).ffill().astype(int)
df1 = df.groupby(s)['text'].apply(''.join).reset_index()
print (df1)
id text
0 1 abczxc
1 3 qweasfefe
2 6 ertpoiwereer
3 10 poywqr
It works because:
s = df.id.where(df.id.isin(L)).ffill().astype(int)
print (s)
0 1
1 1
2 3
3 3
4 3
5 6
6 6
7 6
8 6
9 10
10 10
Name: id, dtype: int32
I changed the values not in the list to np.nan and then used ffill and groupby. Though @jezrael's approach is much better. I need to remember to use cumsum :)
l = [1,3,6,10]
# where() keeps ids that are in l and sets the rest to NaN, avoiding chained assignment
df['id'] = df['id'].where(df['id'].isin(l))
df = df.ffill().groupby('id').sum()
text
id
1.0 abczxc
3.0 qweasfefe
6.0 ertpoiwereer
10.0 poywqr
Use pd.cut to create your bins, then groupby with a lambda function to join the text in each group.
df.groupby(pd.cut(df.id,L+[np.inf],right=False, labels=[i for i in L])).apply(lambda x: ''.join(x.text))
EDIT:
(df.groupby(pd.cut(df.id,L+[np.inf],
right=False,
labels=[i for i in L]))
.apply(lambda x: ''.join(x.text)).reset_index().rename(columns={0:'text'}))
Output:
id text
0 1 abczxc
1 3 qweasfefe
2 6 ertpoiwereer
3 10 poywqr
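
Since cumsum was mentioned above but not shown, here is a minimal sketch of what that variant might look like. It assumes the rows are sorted by id and that the first id is in L; the names group_key and out are my own.

```python
import pandas as pd

df = pd.DataFrame({
    "id": list(range(1, 12)),
    "text": ["abc", "zxc", "qwe", "asf", "efe", "ert",
             "poi", "wer", "eer", "poy", "wqr"],
})
L = [1, 3, 6, 10]

# isin marks the rows that start a new group; cumsum turns those
# marks into running group numbers 1, 1, 2, 2, 2, 3, ...
group_key = df["id"].isin(L).cumsum().rename("group")

# Keep the first id of each group and join the text pieces.
out = (df.groupby(group_key)
         .agg(id=("id", "first"), text=("text", "".join))
         .reset_index(drop=True))
print(out)
```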
