How to aggregate duplicate rows in python? [duplicate] - python

This question already has answers here:
Get statistics for each group (such as count, mean, etc) using pandas GroupBy?
(9 answers)
How to count duplicate rows in pandas dataframe?
(10 answers)
Closed 2 years ago.
I have a dataframe that looks like this:
Cell1 Cell2 Cell3
A B B
A B B
B B A
C B A
I am trying to get the following output:
Cell1 Cell2 Cell3 sum
A B B 2
B B A 1
C B A 1
I tried the aggregate function, but can't find the solution for this.
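One possible approach, sketched here under the assumption that the dataframe is called df and has exactly the three columns shown, is to group on all columns and take the group size. The column name "sum" is simply taken from the desired output above.

```python
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    "Cell1": ["A", "A", "B", "C"],
    "Cell2": ["B", "B", "B", "B"],
    "Cell3": ["B", "B", "A", "A"],
})

# Group on every column, count rows per group, and turn the
# resulting Series back into a dataframe with a "sum" column
counts = (
    df.groupby(["Cell1", "Cell2", "Cell3"])
      .size()
      .reset_index(name="sum")
)
print(counts)
```

Note that groupby sorts the group keys by default, so the row order of the result may differ from the order of first appearance in the input.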

Related

remove all the values from the column and make it blank [duplicate]

This question already has answers here:
How to add an empty column to a dataframe?
(15 answers)
Closed 2 years ago.
I have a dataframe:
a b
1 dasd
2 fsfr12341
3 %%$dasd11
4 &^hkyo1
I need to remove all the values in column b and make it a blank column
a b
1
2
3
4
Kindly help me with this.
Thanks a lot.
Try changing the b column to empty strings '', like this:
df['b'] = ''

How to create dataframe from 2 dataframe if value exist in both dataframe [duplicate]

This question already has answers here:
Pandas Merging 101
(8 answers)
How to filter Pandas dataframe using 'in' and 'not in' like in SQL
(11 answers)
Closed 2 years ago.
I have two pandas DataFrames, each with a single column (ID).
The first one looks like this:
ID
1
2
3
4
5
And the second one looks like this:
ID
3
4
5
6
7
I want to make a new DataFrame by combining these two DataFrames, keeping only the values that exist in both.
This is the result that I want:
ID
3
4
5
Can you show me how to do this in the most efficient way with pandas? Thank you.
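A minimal sketch of two common ways to do this, assuming the two frames are named df1 and df2 (names chosen here for illustration): an inner merge on ID, or filtering with isin. Which one is faster depends on the data, so this is not a claim about the most efficient option.

```python
import pandas as pd

# Sample data reconstructed from the question
df1 = pd.DataFrame({"ID": [1, 2, 3, 4, 5]})
df2 = pd.DataFrame({"ID": [3, 4, 5, 6, 7]})

# Option 1: inner merge keeps only IDs present in both frames
common = df1.merge(df2, on="ID", how="inner")

# Option 2: keep the rows of df1 whose ID also appears in df2
common_isin = df1[df1["ID"].isin(df2["ID"])].reset_index(drop=True)

print(common)
```

If either frame contains repeated IDs, the merge produces one row per matching pair, while the isin filter keeps the rows of df1 as they are.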

How to count all values of a one column? [duplicate]

This question already has answers here:
Count the frequency that a value occurs in a dataframe column
(15 answers)
Closed 3 years ago.
I am trying to count how many times each value occurs in col_a,
for example:
col_a
A
B
C
A
D
B
A
Is there one line of code I can use that would tell me how many times each value (A, B, C, D) occurs in that column?
The solution is value_counts:
df.col_a.value_counts()
Or use groupby with size:
>>> df.groupby('col_a').size()
col_a
A 3
B 2
C 1
D 1
dtype: int64
>>>
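For reference, a self-contained sketch of the value_counts approach on the sample column from the question (the dataframe name df is assumed):

```python
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({"col_a": ["A", "B", "C", "A", "D", "B", "A"]})

# value_counts returns a Series indexed by the distinct values,
# sorted by count in descending order by default
counts = df["col_a"].value_counts()
print(counts)

# Sort by the value itself instead of by count
print(counts.sort_index())
```

Unlike the groupby version, value_counts orders the result by frequency, so the most common value comes first.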

Subtracting and getting values which are not in a column [duplicate]

This question already has answers here:
How do I select rows from a DataFrame based on column values?
(16 answers)
Closed 3 years ago.
I have a dataframe, one:
query
----------
A
B
C
D
E
and a dataframe, two:
query
---------
A
B
C
I want the output as
query
------------
D
E
That is, I want the values in one that are not in two.
I have tried converting them into lists and subtracting the values, but that does not work.
Try:
df1[~df1['query'].isin(df2['query'])]
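As a runnable sketch of the line above, assuming df1 holds dataframe one and df2 holds dataframe two:

```python
import pandas as pd

# Sample data reconstructed from the question
df1 = pd.DataFrame({"query": ["A", "B", "C", "D", "E"]})
df2 = pd.DataFrame({"query": ["A", "B", "C"]})

# isin builds a boolean mask of rows whose value appears in df2;
# ~ inverts it, leaving only the rows that are NOT in df2
only_in_one = df1[~df1["query"].isin(df2["query"])]
print(only_in_one)
```

The result keeps the original index labels of df1 (3 and 4 here); call reset_index(drop=True) if a fresh index is wanted.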

translate dataframe with dictionary in python [duplicate]

This question already has answers here:
Remap values in pandas column with a dict, preserve NaNs
(11 answers)
Closed 5 years ago.
Having the following pandas Dataframe sample:
df = pd.DataFrame([[1,2],[1,2],[3,5]])
df
0 1
0 1 2
1 1 2
2 3 5
And the following dictionary:
d = {1:'foo',2:'bar',3:'tar',4:'tartar',5:'foofoo'}
I would like to "translate" the dataframe by using the dictionary d. The output should look like:
result = pd.DataFrame([['foo','bar'],['foo','bar'],['tar','foofoo']])
result
0 1
0 foo bar
1 foo bar
2 tar foofoo
I would like to avoid using for loops. The solution I'm trying to find is something with map or similar...
Solution
Replacing whole dataframe:
result_1 = df.replace(d)
Replacing a specific column of a dataframe:
result_2 = df.replace({"COLUMN":d})
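A short sketch applying this to the sample data from the question, together with a map-based variant since the question mentions map. The names mapped and first_col are introduced here for illustration only.

```python
import pandas as pd

# Sample data and dictionary from the question
df = pd.DataFrame([[1, 2], [1, 2], [3, 5]])
d = {1: "foo", 2: "bar", 3: "tar", 4: "tartar", 5: "foofoo"}

# Replace across the whole dataframe
result_1 = df.replace(d)

# Map-based alternative: apply Series.map to each column
mapped = df.apply(lambda col: col.map(d))

# Translate a single column only (column label 0 here)
first_col = df[0].map(d)

print(result_1)
```

One difference to keep in mind: replace leaves values that are missing from the dictionary unchanged, whereas map turns them into NaN.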
