how to use map in index of pandas dataframe [duplicate]

This question already has answers here:
Map dataframe index using dictionary
(6 answers)
Closed 1 year ago.
I want to create a new column on a pandas dataframe, using values from the index and a dictionary that translates those values into something more meaningful. My initial idea was to use map. I arrived at a solution, but it is very convoluted, and there must be a more elegant way to do it. Suggestions?
#dataframe and dict definition
df=pd.DataFrame({'foo':[1,2,3],'boo':[3,4,5]},index=['a','b','c'])
d={'a':'aa','b':'bb','c':'cc'}
df['new column']=df.reset_index().set_index('index',drop=False)['index'].map(d)

Creating a new series explicitly is a bit shorter:
df['new column'] = pd.Series(df.index, index=df.index).map(d)

After converting the index with to_series, you can use map or replace:
df.index=df.index.to_series().map(d)
df
Out[806]:
    boo  foo
aa    3    1
bb    4    2
cc    5    3
Or, another way:
df['New']=pd.Series(d).get(df.index)
df
Out[818]:
   boo  foo New
a    3    1  aa
b    4    2  bb
c    5    3  cc
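Worth noting, though none of the answers above mention it: in reasonably recent pandas versions the Index itself supports map, so the whole thing collapses to a single assignment (same df and d as in the question):

```python
import pandas as pd

df = pd.DataFrame({'foo': [1, 2, 3], 'boo': [3, 4, 5]}, index=['a', 'b', 'c'])
d = {'a': 'aa', 'b': 'bb', 'c': 'cc'}

# Index.map applies the dict to each index label and returns a new Index,
# which can be assigned directly as a column
df['new column'] = df.index.map(d)
```

This avoids both the reset_index/set_index round-trip and the explicit intermediate Series.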


Transpose a table using pandas [duplicate]

This question already has answers here:
How can I pivot a dataframe?
(5 answers)
Closed 7 months ago.
I have a dataframe that looks like this:
A       type  val
first   B     20
second  B     30
first   C     200
second  C     300
I need to get it to look like this:
A       B   C
first   20  200
second  30  300
How do I do this using Pandas? I tried using transpose, but couldn't get it to this exact table.
df = df.pivot(index='A', columns='type')
df.columns = [x[1] for x in df.columns]
df = df.reset_index()
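Spelled out as a self-contained example: passing values to pivot explicitly makes pandas return flat columns, so the MultiIndex flattening step becomes unnecessary.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['first', 'second', 'first', 'second'],
    'type': ['B', 'B', 'C', 'C'],
    'val': [20, 30, 200, 300],
})

# pivot turns the unique 'type' values into columns, indexed by 'A';
# naming 'val' explicitly yields a flat column index
out = df.pivot(index='A', columns='type', values='val').reset_index()
```

out then has columns A, B, C with one row per value of A, matching the desired table.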

How to Group by a dataframe with a custom aggregation function AND without SQL?

I currently have this dataframe:
[image: original dataframe]
However, I would like to obtain from it a dataframe (not containing the 't') which looks like this (considering the index):
[image: the index we want for our original dataframe]
This is of course easily done using .groupby().agg(), but the thing is that I don't have a simple aggregation function such as 'max' or 'mean' that I would like to use. Hence my question: is it possible to group a dataframe with a customized aggregation function, and without using SQL? If so, please let me know!
I would love to get some help!
Simplified code example explaining my question:
df_example =
     C  D  E
A B
1 2  5  8  9
  3  7  9  3
2 4  9  5  5
  6  1  4  5
We would like to obtain:
df_example_groupedby_A_only_aggregating_with_custom_function =
   Z_custom
A
1        33
2        34
The values in Z_custom are obtained by using the custom aggregation function which uses the values in columns [C,D,E] from df_example.
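Yes: groupby accepts any callable via agg or apply, so no SQL is needed. A minimal sketch, with the question's df_example rebuilt and a hypothetical stand-in for the custom function (the real one is not shown in the question, so the output values below differ from the 33/34 in the example):

```python
import pandas as pd

# rebuild df_example with its (A, B) MultiIndex
idx = pd.MultiIndex.from_tuples([(1, 2), (1, 3), (2, 4), (2, 6)], names=['A', 'B'])
df_example = pd.DataFrame(
    {'C': [5, 7, 9, 1], 'D': [8, 9, 5, 4], 'E': [9, 3, 5, 5]}, index=idx
)

def custom(group):
    # hypothetical aggregation over columns C, D, E of one group;
    # replace with your own logic
    return group[['C', 'D', 'E']].to_numpy().sum()

# group by index level A only; apply runs custom once per group
result = df_example.groupby(level='A').apply(custom).rename('Z_custom')
```

Any function that takes a group's sub-dataframe and returns a scalar works here; apply collects the scalars into a Series indexed by A.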

How to aggregate rows in a dataframe [duplicate]

This question already has answers here:
How do I Pandas group-by to get sum?
(11 answers)
Closed 2 years ago.
I have a dataframe that contains values by country (and by region in certain countries) and which looks like this:
For each country that is repeated, I would like to add up the values across its regions so that there is only one row per country, obtaining the following file:
How can I do this in Python? Since I'm really new to Python, I don't mind having a long set of instructions, as long as the procedure is clear, rather than a single line of code, compacted but hard to understand.
Thanks for your help.
You want to study the split-apply-combine paradigm of Pandas DataFrame manipulation. You can do a lot with it. What you want to do is common, and can be accomplished in one line.
>>> import pandas as pd
>>> df = pd.DataFrame({"foo": ["a","b","a","b","c"], "bar": [6,5,4,3,2]})
>>> df
  foo  bar
0   a    6
1   b    5
2   a    4
3   b    3
4   c    2
>>> df.groupby("foo").sum()
     bar
foo
a     10
b      8
c      2
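Applied to the country/region case (the column names below are assumptions, since the original file is not shown), the same pattern reads:

```python
import pandas as pd

# hypothetical layout: one row per region, values to be summed per country
df = pd.DataFrame({
    'country': ['France', 'France', 'Spain', 'Italy'],
    'region': ['Nord', 'Sud', None, None],
    'value': [10, 15, 7, 9],
})

# group rows sharing a country and sum the numeric column;
# as_index=False keeps 'country' as a regular column in the result
per_country = df.groupby('country', as_index=False)['value'].sum()
```

Countries that appear once pass through unchanged; repeated countries are collapsed into one row holding the sum of their regions.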

translate dataframe with dictionary in python [duplicate]

This question already has answers here:
Remap values in pandas column with a dict, preserve NaNs
(11 answers)
Closed 5 years ago.
Having the following pandas Dataframe sample:
df = pd.DataFrame([[1,2],[1,2],[3,5]])
df
   0  1
0  1  2
1  1  2
2  3  5
And the following dictionary:
d = {1:'foo',2:'bar',3:'tar',4:'tartar',5:'foofoo'}
I would like to "translate" the dataframe by using the dictionary d. The output looks like:
result = pd.DataFrame([['foo','bar'],['foo','bar'],['tar','foofoo']])
result
     0       1
0  foo     bar
1  foo     bar
2  tar  foofoo
I would like to avoid using for loops. The solution I'm trying to find is something with map or similar...
Solution
Replacing whole dataframe:
result_1 = df.replace(d)
Replacing a specific column of a dataframe:
result_2 = df.replace({"COLUMN":d})
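If you do want the map route from the question, applying it column-wise works too. One behavioural difference worth knowing: map turns values missing from d into NaN, whereas replace leaves them unchanged.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [1, 2], [3, 5]])
d = {1: 'foo', 2: 'bar', 3: 'tar', 4: 'tartar', 5: 'foofoo'}

# run each column through the dict; values absent from d would become NaN
result = df.apply(lambda col: col.map(d))
```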

Pandas SettingWithCopyWarning When Using loc [duplicate]

This question already has answers here:
How to deal with SettingWithCopyWarning in Pandas
(20 answers)
Closed 2 years ago.
Have a general question on assignments with indexing/slicing using .loc.
Assume the below DataFrame, df:
df:
   A  B     C
0  a  b  None
1  a  b  None
2  b  a  None
3  c  c  None
4  c  a  None
code to reproduce:
df = pd.DataFrame({'A':list('aabcc'), 'B':list('bbaca'), 'C':5*[None]})
I create df1 using:
df1=df.loc[df.A=='c']
df1:
   A  B     C
3  c  c  None
4  c  a  None
I then assign a value to C based upon a value in B using:
df1.loc[df1.B=='a','C']='d'
The assignment works, but I receive a SettingWithCopy warning. Am I doing something wrong, or is this the expected behaviour? I thought that using .loc would avoid chained assignment. Is there something I am missing? I am using pandas 0.14.1.
@EdChum's answer in the comments to the OP solved the issue,
i.e. replace
df1=df.loc[df.A=='c']
with
df1=df.loc[df.A=='c'].copy()
This makes your intention explicit and will not raise the warning.
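A minimal end-to-end version of the fix:

```python
import pandas as pd

df = pd.DataFrame({'A': list('aabcc'), 'B': list('bbaca'), 'C': 5 * [None]})

# explicit copy: df1 is its own object, so the .loc assignment below
# modifies df1 only, without touching df and without the warning
df1 = df.loc[df.A == 'c'].copy()
df1.loc[df1.B == 'a', 'C'] = 'd'
```

Without .copy(), df1 may be a view of df, and pandas warns because it cannot tell whether you intended the assignment to propagate back to df.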
