How to rename columns by position? - python

from pandas import DataFrame
df = DataFrame({
'A' : [1, 2, 3]
, 'B' : [3, 2, 1]
})
print(df.rename(columns={'A': 'a', 'B': 'b'}))
I know that I can rename columns like the above.
But sometimes, I just what to rename columns by position so that I don't have to specify the old names.
df.rename(columns=['a', 'b']))
df.rename(columns={0: 'a', 1: 'b'})
I tried the above. But none of them work. Could anybody show me a concise way to rename without specifying original names?
I looking for a way with minimal code. Ideal, this should and could have been supported by the current function interface of rename(). Using a for-loop or something to create a dictionary with the old column names could be a solution but not preferred.

Here you go:
df.rename(columns={ df.columns[0]: 'a', df.columns[1]: 'b',}, inplace = True)
df
Prints:
a b
0 1 3
1 2 2
2 3 1

You might assign directly to pandas.DataFrame.columns i.e.
import pandas as pd
df = pd.DataFrame({'A':[1, 2, 3],'B':[3, 2, 1]})
df.columns = ["X","Y"]
print(df)
output
X Y
0 1 3
1 2 2
2 3 1

You can create a function to rename them:
names = iter(['a', 'b'])
def renamer(col):
return next(names)
df.rename(renamer, axis='columns', inplace=True)
The advantage of this approach is that it is enough that you know the order of the columns to rename and renamer function does not even have to use its parameter.

With this method, you can rename whatever columns by position. It converts the index position (0, 1) into the column name and create the original mapping ({'A': 'a', 'B': 'b'})
c = {0: 'a', 1: 'b'}
m = dict(zip(df.columns[list(c.keys())], c.values()))
>>> m
{'A': 'a', 'B': 'b'}
>>> df.rename(columns=m)
a b
0 1 3
1 2 2
2 3 1

Unclear if the other solutions work in the presence of two or more columns with the same name. Here's a function which does:
# rename df columns by position (as opposed to index)
# mapper is a dict where keys = ordinal column position and vals = new titles
# unclear that using the native df rename() function produces the correct results when renaming by position
def rename_df_cols_by_position(df, mapper):
new_cols = [df.columns[i] if i not in mapper.keys() else mapper[i] for i in range(0, len(df.columns))]
df.columns = new_cols
return

Related

How to get the frequency of column depending on certain values of another column [duplicate]

I am using .size() on a groupby result in order to count how many items are in each group.
I would like the result to be saved to a new column name without manually editing the column names array, how can it be done?
This is what I have tried:
grpd = df.groupby(['A','B'])
grpd['size'] = grpd.size()
grpd
and the error I got:
TypeError: 'DataFrameGroupBy' object does not support item assignment
(on the second line)
The .size() built-in method of DataFrameGroupBy objects actually returns a Series object with the group sizes and not a DataFrame. If you want a DataFrame whose column is the group sizes, indexed by the groups, with a custom name, you can use the .to_frame() method and use the desired column name as its argument.
grpd = df.groupby(['A','B']).size().to_frame('size')
If you wanted the groups to be columns again you could add a .reset_index() at the end.
You need transform size - len of df is same as before:
Notice:
Here it is necessary to add one column after groupby, else you get an error. Because GroupBy.size count NaNs too, what column is used is not important. All columns working same.
import pandas as pd
df = pd.DataFrame({'A': ['x', 'x', 'x','y','y']
, 'B': ['a', 'c', 'c','b','b']})
print (df)
A B
0 x a
1 x c
2 x c
3 y b
4 y b
df['size'] = df.groupby(['A', 'B'])['A'].transform('size')
print (df)
A B size
0 x a 1
1 x c 2
2 x c 2
3 y b 2
4 y b 2
If need set column name in aggregating df - len of df is obviously NOT same as before:
import pandas as pd
df = pd.DataFrame({'A': ['x', 'x', 'x','y','y']
, 'B': ['a', 'c', 'c','b','b']})
print (df)
A B
0 x a
1 x c
2 x c
3 y b
4 y b
df = df.groupby(['A', 'B']).size().reset_index(name='Size')
print (df)
A B Size
0 x a 1
1 x c 2
2 y b 2
The result of df.groupby(...) is not a DataFrame. To get a DataFrame back, you have to apply a function to each group, transform each element of a group, or filter the groups.
It seems like you want a DataFrame that contains (1) all your original data in df and (2) the count of how much data is in each group. These things have different lengths, so if they need to go into the same DataFrame, you'll need to list the size redundantly, i.e., for each row in each group.
df['size'] = df.groupby(['A','B']).transform(np.size)
(Aside: It's helpful if you can show succinct sample input and expected results.)
You can set the as_index parameter in groupby to False to get a DataFrame instead of a Series:
df = pd.DataFrame({'A': ['a', 'a', 'b', 'b'], 'B': [1, 2, 2, 2]})
df.groupby(['A', 'B'], as_index=False).size()
Output:
A B size
0 a 1 1
1 a 2 1
2 b 2 2
lets say n is the name of dataframe and cst is the no of items being repeted.
Below code gives the count in next column
cstn=Counter(n.cst)
cstlist = pd.DataFrame.from_dict(cstn, orient='index').reset_index()
cstlist.columns=['name','cnt']
n['cnt']=n['cst'].map(cstlist.loc[:, ['name','cnt']].set_index('name').iloc[:,0].to_dict())
Hope this will work

Calculating how many values are in a column per each index [duplicate]

I am using .size() on a groupby result in order to count how many items are in each group.
I would like the result to be saved to a new column name without manually editing the column names array, how can it be done?
This is what I have tried:
grpd = df.groupby(['A','B'])
grpd['size'] = grpd.size()
grpd
and the error I got:
TypeError: 'DataFrameGroupBy' object does not support item assignment
(on the second line)
The .size() built-in method of DataFrameGroupBy objects actually returns a Series object with the group sizes and not a DataFrame. If you want a DataFrame whose column is the group sizes, indexed by the groups, with a custom name, you can use the .to_frame() method and use the desired column name as its argument.
grpd = df.groupby(['A','B']).size().to_frame('size')
If you wanted the groups to be columns again you could add a .reset_index() at the end.
You need transform size - len of df is same as before:
Notice:
Here it is necessary to add one column after groupby, else you get an error. Because GroupBy.size count NaNs too, what column is used is not important. All columns working same.
import pandas as pd
df = pd.DataFrame({'A': ['x', 'x', 'x','y','y']
, 'B': ['a', 'c', 'c','b','b']})
print (df)
A B
0 x a
1 x c
2 x c
3 y b
4 y b
df['size'] = df.groupby(['A', 'B'])['A'].transform('size')
print (df)
A B size
0 x a 1
1 x c 2
2 x c 2
3 y b 2
4 y b 2
If need set column name in aggregating df - len of df is obviously NOT same as before:
import pandas as pd
df = pd.DataFrame({'A': ['x', 'x', 'x','y','y']
, 'B': ['a', 'c', 'c','b','b']})
print (df)
A B
0 x a
1 x c
2 x c
3 y b
4 y b
df = df.groupby(['A', 'B']).size().reset_index(name='Size')
print (df)
A B Size
0 x a 1
1 x c 2
2 y b 2
The result of df.groupby(...) is not a DataFrame. To get a DataFrame back, you have to apply a function to each group, transform each element of a group, or filter the groups.
It seems like you want a DataFrame that contains (1) all your original data in df and (2) the count of how much data is in each group. These things have different lengths, so if they need to go into the same DataFrame, you'll need to list the size redundantly, i.e., for each row in each group.
df['size'] = df.groupby(['A','B']).transform(np.size)
(Aside: It's helpful if you can show succinct sample input and expected results.)
You can set the as_index parameter in groupby to False to get a DataFrame instead of a Series:
df = pd.DataFrame({'A': ['a', 'a', 'b', 'b'], 'B': [1, 2, 2, 2]})
df.groupby(['A', 'B'], as_index=False).size()
Output:
A B size
0 a 1 1
1 a 2 1
2 b 2 2
lets say n is the name of dataframe and cst is the no of items being repeted.
Below code gives the count in next column
cstn=Counter(n.cst)
cstlist = pd.DataFrame.from_dict(cstn, orient='index').reset_index()
cstlist.columns=['name','cnt']
n['cnt']=n['cst'].map(cstlist.loc[:, ['name','cnt']].set_index('name').iloc[:,0].to_dict())
Hope this will work

How to drop rows based on column value if column is not set as index in pandas?

I have a list and a dataframe which look like this:
list = ['a', 'b']
df = pd.DataFrame({'A':['a', 'b', 'c', 'd'], 'B':[9, 9, 8, 4]})
I would like to do something like this:
df1 = df.drop([x for x in list])
I am getting the following error message:
"KeyError: "['a' 'b'] not found in axis""
I know I can do the following:
df = pd.DataFrame({'A':['a', 'b', 'c', 'd'], 'B':[9, 9, 8, 4]}).set_index('A')
df1 = df.drop([x for x in list])
How can I drop the list values without having to set column 'A' as index? My dataframe has multiple columns.
Input:
A B
0 a 9
1 b 9
2 c 8
3 d 4
Code:
for i in list:
ind = df[df['A']==i].index.tolist()
df=df.drop(ind)
df
Output:
A B
2 c 8
3 d 4
You need to specify the correct axis, namely axis=1.
According to the docs:
axis{0 or ‘index’, 1 or ‘columns’}, default 0. Whether to drop labels from the index (0 or ‘index’) or columns (1 or ‘columns’).
You also need to make sure you are using the correct column names, (ie. uppercase rather than lower case. And you should avoid using python reserved names, so don't use list.
This should work:
myList = ['A', 'B']
df1 = df.drop([x for x in myList], axis=1)

Add row counts to entire dataframe [duplicate]

I am using .size() on a groupby result in order to count how many items are in each group.
I would like the result to be saved to a new column name without manually editing the column names array, how can it be done?
This is what I have tried:
grpd = df.groupby(['A','B'])
grpd['size'] = grpd.size()
grpd
and the error I got:
TypeError: 'DataFrameGroupBy' object does not support item assignment
(on the second line)
The .size() built-in method of DataFrameGroupBy objects actually returns a Series object with the group sizes and not a DataFrame. If you want a DataFrame whose column is the group sizes, indexed by the groups, with a custom name, you can use the .to_frame() method and use the desired column name as its argument.
grpd = df.groupby(['A','B']).size().to_frame('size')
If you wanted the groups to be columns again you could add a .reset_index() at the end.
You need transform size - len of df is same as before:
Notice:
Here it is necessary to add one column after groupby, else you get an error. Because GroupBy.size count NaNs too, what column is used is not important. All columns working same.
import pandas as pd
df = pd.DataFrame({'A': ['x', 'x', 'x','y','y']
, 'B': ['a', 'c', 'c','b','b']})
print (df)
A B
0 x a
1 x c
2 x c
3 y b
4 y b
df['size'] = df.groupby(['A', 'B'])['A'].transform('size')
print (df)
A B size
0 x a 1
1 x c 2
2 x c 2
3 y b 2
4 y b 2
If need set column name in aggregating df - len of df is obviously NOT same as before:
import pandas as pd
df = pd.DataFrame({'A': ['x', 'x', 'x','y','y']
, 'B': ['a', 'c', 'c','b','b']})
print (df)
A B
0 x a
1 x c
2 x c
3 y b
4 y b
df = df.groupby(['A', 'B']).size().reset_index(name='Size')
print (df)
A B Size
0 x a 1
1 x c 2
2 y b 2
The result of df.groupby(...) is not a DataFrame. To get a DataFrame back, you have to apply a function to each group, transform each element of a group, or filter the groups.
It seems like you want a DataFrame that contains (1) all your original data in df and (2) the count of how much data is in each group. These things have different lengths, so if they need to go into the same DataFrame, you'll need to list the size redundantly, i.e., for each row in each group.
df['size'] = df.groupby(['A','B']).transform(np.size)
(Aside: It's helpful if you can show succinct sample input and expected results.)
You can set the as_index parameter in groupby to False to get a DataFrame instead of a Series:
df = pd.DataFrame({'A': ['a', 'a', 'b', 'b'], 'B': [1, 2, 2, 2]})
df.groupby(['A', 'B'], as_index=False).size()
Output:
A B size
0 a 1 1
1 a 2 1
2 b 2 2
lets say n is the name of dataframe and cst is the no of items being repeted.
Below code gives the count in next column
cstn=Counter(n.cst)
cstlist = pd.DataFrame.from_dict(cstn, orient='index').reset_index()
cstlist.columns=['name','cnt']
n['cnt']=n['cst'].map(cstlist.loc[:, ['name','cnt']].set_index('name').iloc[:,0].to_dict())
Hope this will work

How to assign a name to the size() column?

I am using .size() on a groupby result in order to count how many items are in each group.
I would like the result to be saved to a new column name without manually editing the column names array, how can it be done?
This is what I have tried:
grpd = df.groupby(['A','B'])
grpd['size'] = grpd.size()
grpd
and the error I got:
TypeError: 'DataFrameGroupBy' object does not support item assignment
(on the second line)
The .size() built-in method of DataFrameGroupBy objects actually returns a Series object with the group sizes and not a DataFrame. If you want a DataFrame whose column is the group sizes, indexed by the groups, with a custom name, you can use the .to_frame() method and use the desired column name as its argument.
grpd = df.groupby(['A','B']).size().to_frame('size')
If you wanted the groups to be columns again you could add a .reset_index() at the end.
You need transform size - len of df is same as before:
Notice:
Here it is necessary to add one column after groupby, else you get an error. Because GroupBy.size count NaNs too, what column is used is not important. All columns working same.
import pandas as pd
df = pd.DataFrame({'A': ['x', 'x', 'x','y','y']
, 'B': ['a', 'c', 'c','b','b']})
print (df)
A B
0 x a
1 x c
2 x c
3 y b
4 y b
df['size'] = df.groupby(['A', 'B'])['A'].transform('size')
print (df)
A B size
0 x a 1
1 x c 2
2 x c 2
3 y b 2
4 y b 2
If need set column name in aggregating df - len of df is obviously NOT same as before:
import pandas as pd
df = pd.DataFrame({'A': ['x', 'x', 'x','y','y']
, 'B': ['a', 'c', 'c','b','b']})
print (df)
A B
0 x a
1 x c
2 x c
3 y b
4 y b
df = df.groupby(['A', 'B']).size().reset_index(name='Size')
print (df)
A B Size
0 x a 1
1 x c 2
2 y b 2
The result of df.groupby(...) is not a DataFrame. To get a DataFrame back, you have to apply a function to each group, transform each element of a group, or filter the groups.
It seems like you want a DataFrame that contains (1) all your original data in df and (2) the count of how much data is in each group. These things have different lengths, so if they need to go into the same DataFrame, you'll need to list the size redundantly, i.e., for each row in each group.
df['size'] = df.groupby(['A','B']).transform(np.size)
(Aside: It's helpful if you can show succinct sample input and expected results.)
You can set the as_index parameter in groupby to False to get a DataFrame instead of a Series:
df = pd.DataFrame({'A': ['a', 'a', 'b', 'b'], 'B': [1, 2, 2, 2]})
df.groupby(['A', 'B'], as_index=False).size()
Output:
A B size
0 a 1 1
1 a 2 1
2 b 2 2
lets say n is the name of dataframe and cst is the no of items being repeted.
Below code gives the count in next column
cstn=Counter(n.cst)
cstlist = pd.DataFrame.from_dict(cstn, orient='index').reset_index()
cstlist.columns=['name','cnt']
n['cnt']=n['cst'].map(cstlist.loc[:, ['name','cnt']].set_index('name').iloc[:,0].to_dict())
Hope this will work

Categories