I have a DataFrame with two columns:
col1 | col2
20 EUR
31 GBP
5 JPY
I may have 10,000 rows like this.
How can I quickly convert every amount to a base currency of GBP?
Should I use easymoney?
I know how to convert a single row, but I do not know how to iterate through all the rows quickly.
EDIT:
I would like to apply something like:
def convert_currency(amount, currency_symbol):
    converted = ep.currency_converter(amount=1000, from_currency=currency_symbol, to_currency="GBP")
    return converted
df.loc[df.currency != 'GBP', 'col1'] = convert_currency(currency_data.col1, df.col2)
but it does not work yet.
Join a third column with the conversion rates for each currency, joining on the currency code in col2. Then create a column with the translated amount.
dfRate:
code | rate
EUR 1.123
USD 2.234
df2 = pd.merge(df1, dfRate, how='left', left_on=['col2'], right_on=['code'])
df2['translatedAmt'] = df2['col1'] / df2['rate']
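A minimal runnable sketch of the merge approach described above. The rates are made-up placeholders (units of each currency per 1 GBP), not real market data. It also assumes the rate table has no GBP row, so the base currency is given a rate of 1 after the left join.

```python
import pandas as pd

df1 = pd.DataFrame({"col1": [20, 31, 5], "col2": ["EUR", "GBP", "JPY"]})

# Hypothetical rates: units of each currency per 1 GBP (placeholders only).
dfRate = pd.DataFrame({"code": ["EUR", "JPY"], "rate": [1.25, 200.0]})

df2 = pd.merge(df1, dfRate, how="left", left_on="col2", right_on="code")

# GBP has no row in dfRate, so the left join leaves its rate as NaN.
# The base currency converts at 1.
df2["rate"] = df2["rate"].where(df2["col2"] != "GBP", 1.0)

df2["translatedAmt"] = df2["col1"] / df2["rate"]
print(df2[["col1", "col2", "rate", "translatedAmt"]])
```

Any other currency that is missing from the rate table still ends up with NaN in translatedAmt, which makes missing rates easy to spot.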
df = pd.DataFrame([[20, 'EUR'], [31, 'GBP'], [5, 'JPY']], columns=['value', 'currency'])
print(df)
value currency
0 20 EUR
1 31 GBP
2 5 JPY
def convert_to_gbp(args):  # placeholder for your fancy conversion function
    amount, currency = args
    rates = {'EUR': 2, 'JPY': 10, 'GBP': 1}
    return rates[currency] * amount
df.assign(**{'In GBP': df.apply(convert_to_gbp, axis=1)})
value currency In GBP
0 20 EUR 40
1 31 GBP 31
2 5 JPY 50
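With 10,000 rows, the row-wise apply calls the conversion function once per row. A possible alternative, sketched below, is to look up the rate once per distinct currency and then multiply whole columns. Here fetch_rate_to_gbp is a hypothetical stand-in for whatever rate source you use, and it reuses the placeholder rates from the function above.

```python
import pandas as pd

def fetch_rate_to_gbp(currency):
    """Hypothetical stand-in for a real rate lookup; returns placeholder rates."""
    placeholder_rates = {'EUR': 2, 'JPY': 10, 'GBP': 1}
    return placeholder_rates[currency]

df = pd.DataFrame([[20, 'EUR'], [31, 'GBP'], [5, 'JPY']], columns=['value', 'currency'])

# One lookup per distinct currency instead of one per row.
rates = {ccy: fetch_rate_to_gbp(ccy) for ccy in df['currency'].unique()}

# Map each row's currency to its rate, then multiply column-wise.
df['In GBP'] = df['value'] * df['currency'].map(rates)
print(df)
```

If the lookup is a network call, this reduces the number of calls from the number of rows to the number of distinct currencies.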
Let me simplify this.
I have a currency conversion key DataFrame with two columns: currency and spot FX rate in USD.
currency fx_rate
INR 0.013223
JPY 0.008653
MYR 0.239000
CNY 0.157300
HKD 0.128160
My 2nd DataFrame is
currency ID amount outstanding
INR 78waf 1000000000
JPY 48waf 100000000000
MYR 38waf 10000000
CNY 28waf 1000000000
HKD 18waf 10000000
How would I create a fourth column in my second DataFrame ("CONVERTED AMOUNT") based on the currency key DataFrame? You would need to multiply for the conversion (for INR: 0.013223 * 1000000000 = 13223000).
end goal:
currency ID amount outstanding CONVERTED AMOUNT
INR 78waf 1000000000 13223000
JPY 48waf 100000000000 etc..
MYR 38waf 10000000 ...
CNY 28waf 1000000000 ...
HKD 18waf 10000000 ...
Your two last dataframes seem to have more columns than they should.
Anyway, if I understand your problem correctly, I would probably just do this:
df["CONVERTED AMOUNT"] = df["amount"] * 0.013223
You could map "fx_rate" to "currency" column in df2 and multiply with "amount_outstanding":
df2['converted_amount'] = df2['currency'].map(df1.set_index('currency')['fx_rate']) * df2['amount_outstanding']
You could also set_index with "currency" and multiply the amounts on matching currencies and reset_index:
df2 = df2.set_index('currency')
df2['converted_amount'] = df2['amount_outstanding'].mul(df1.set_index('currency')['fx_rate'])
df2 = df2.reset_index()
Another option is to merge the DataFrames on "currency" and multiply the relevant columns:
merged = df2.merge(df1, on='currency')
merged['converted_amount'] = merged['amount_outstanding'] * merged['fx_rate']
out = merged.drop(columns=['fx_rate'])
Output:
currency ID amount_outstanding converted_amount
0 INR 78waf 1000000000 13223000.0
1 JPY 48waf 100000000000 865300000.0
2 MYR 38waf 10000000 2390000.0
3 CNY 28waf 1000000000 157300000.0
4 HKD 18waf 10000000 1281600.0
df = pd.merge(df2, df1, how = 'left', on = 'currency')
df['converted_amount'] = df['amount_outstanding'] * df['fx_rate']
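For completeness, here is a self-contained sketch of the left-merge variant using the data from the question. The column name amount_outstanding follows the answers above. The validate argument and the missing-rate check are my additions, not part of the original answers.

```python
import pandas as pd

df1 = pd.DataFrame({
    "currency": ["INR", "JPY", "MYR", "CNY", "HKD"],
    "fx_rate": [0.013223, 0.008653, 0.239000, 0.157300, 0.128160],
})
df2 = pd.DataFrame({
    "currency": ["INR", "JPY", "MYR", "CNY", "HKD"],
    "ID": ["78waf", "48waf", "38waf", "28waf", "18waf"],
    "amount_outstanding": [1000000000, 100000000000, 10000000, 1000000000, 10000000],
})

# validate="many_to_one" raises MergeError if a currency appears twice in the rate table.
merged = df2.merge(df1, on="currency", how="left", validate="many_to_one")

# With how="left", a currency without a rate keeps its row but gets NaN.
missing = merged.loc[merged["fx_rate"].isna(), "currency"].unique().tolist()

merged["converted_amount"] = merged["amount_outstanding"] * merged["fx_rate"]
print(merged.drop(columns="fx_rate"))
```

Checking missing before using the result catches currencies that silently produced NaN.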
I have a table with a list of currencies as a column in df1['Ccy']:
Ccy
0 GBP
1 GBP
2 USD
3 EUR
4 GBP
5 USD
I have a second table that has values, df2:
Ccy FX Rate
0 USD 0.750244
1 JPY 0.007196
2 GBP 1.000000
3 EUR 0.893390
How can I create a mapping as a new column in df1 that has the FX Rate column values per respective currency, e.g:
Ccy FX Rate
0 GBP 1.000000
1 GBP 1.000000
2 USD 0.750244
3 EUR 0.893390
4 GBP 1.000000
5 USD 0.750244
I could do a mapping like the below but it replaces original currencies rather than creates a new column with the mapped numerical values:
rename_dict = df2.set_index('Ccy').to_dict()['FX Rate']
df1 = df1.replace(rename_dict)
I simply want to add the mappings as a new column to the original df1.
Thanks!
You can merge the two DataFrames on "Ccy":
newdf = pd.merge(df1,df2,on="Ccy",how="left")
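If you would rather add the column to df1 in place than build a new DataFrame, Series.map with a lookup Series is one option. A small sketch using the data from the question:

```python
import pandas as pd

df1 = pd.DataFrame({"Ccy": ["GBP", "GBP", "USD", "EUR", "GBP", "USD"]})
df2 = pd.DataFrame({
    "Ccy": ["USD", "JPY", "GBP", "EUR"],
    "FX Rate": [0.750244, 0.007196, 1.000000, 0.893390],
})

# Build a lookup indexed by currency code, then map each row of df1 through it.
fx_lookup = df2.set_index("Ccy")["FX Rate"]
df1["FX Rate"] = df1["Ccy"].map(fx_lookup)
print(df1)
```

Unlike replace, this leaves the Ccy column untouched. Currencies that are not in df2 would get NaN.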
I have 2 dfs:
df2
dec_pl cur_key
0 JPY
1 HKD
df1
cur amount
JPY 80
HKD 20
USD 70
I would like to look up dec_pl in df2 for each 'cur' in df1 and calculate df1.converted_amount = df1.amount * 10 ** (2 - df2.dec_pl), i.e. df1.amount times 10 to the power of (2 - df2.dec_pl). If no corresponding df2.cur_key can be found for a df1.cur value (e.g. USD), just use its amount. I tried:
df1 = df1.set_index('cur')
df2 = df2.set_index('cur_key')
df1['converted_amount'] = (df1.amount*10**(2 - df2.dec_pl)).fillna(df1['amount'], downcast='infer')
but I got
ValueError: cannot reindex from a duplicate axis
I am wondering what the best way to do this is. The results should look like:
df1
cur amount converted_amount
JPY 80 8000
HKD 20 200
USD 70 70
One possible problem is duplicates in the cur_key column, like:
print (df2)
dec_pl cur_key
0 0 HKD
1 1 HKD
df1 = df1.set_index('cur')
Solutions are to aggregate the duplicates so that cur_key is unique, e.g. by sum:
df2 = df2.groupby('cur_key').sum()
Or remove duplicates, keeping only the first or last value per cur_key:
#first default value
df2 = df2.drop_duplicates('cur_key').set_index('cur_key')
#last value
#df2 = df2.drop_duplicates('cur_key', keep='last').set_index('cur_key')
df1['converted_amount'] = (df1.amount*10**(2 - df2.dec_pl)).fillna(df1['amount'], downcast='infer')
print (df1)
amount converted_amount
cur
JPY 80 80
HKD 20 200
USD 70 70
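An alternative sketch that avoids index alignment altogether by using map, with the sample data from the question. It treats a missing currency as dec_pl = 2, so the factor becomes 10 ** 0 = 1 and the amount is left unchanged. That default is my assumption, based on the "just use its amount" requirement.

```python
import pandas as pd

df1 = pd.DataFrame({"cur": ["JPY", "HKD", "USD"], "amount": [80, 20, 70]})
df2 = pd.DataFrame({"dec_pl": [0, 1], "cur_key": ["JPY", "HKD"]})

# Keep one dec_pl per currency so the lookup index is unique.
dec_pl = df2.drop_duplicates("cur_key").set_index("cur_key")["dec_pl"]

# Currencies missing from df2 (USD here) get dec_pl = 2, i.e. a factor of 10 ** 0 = 1.
exponent = 2 - df1["cur"].map(dec_pl).fillna(2)

df1["converted_amount"] = (df1["amount"] * 10 ** exponent).astype(int)
print(df1)
```

Because map only needs a unique index on the lookup side, duplicate currencies in df1 are fine.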
I am trying to perform an action in Python which is very similar to VLOOKUP in Excel. There have been many questions related to this on StackOverflow but they are all slightly different from this use case. Hopefully anyone can guide me in the right direction. I have the following two pandas dataframes:
df1 = pd.DataFrame({'Invoice': ['20561', '20562', '20563', '20564'],
'Currency': ['EUR', 'EUR', 'EUR', 'USD']})
df2 = pd.DataFrame({'Ref': ['20561', 'INV20562', 'INV20563BG', '20564'],
'Type': ['01', '03', '04', '02'],
'Amount': ['150', '175', '160', '180'],
'Comment': ['bla', 'bla', 'bla', 'bla']})
print(df1)
Invoice Currency
0 20561 EUR
1 20562 EUR
2 20563 EUR
3 20564 USD
print(df2)
Ref Type Amount Comment
0 20561 01 150 bla
1 INV20562 03 175 bla
2 INV20563BG 04 160 bla
3 20564 02 180 bla
Now I would like to create a new dataframe (df3) where I combine the two based on the invoice numbers. The problem is that the invoice numbers are not always a "full match", but sometimes a "partial match" in df2['Ref']. So the joining on 'Invoice' does not give the desired output because it doesn't copy the data for invoices 20562 & 20563, see below:
df3 = df1.join(df2.set_index('Ref'), on='Invoice')
print(df3)
Invoice Currency Type Amount Comment
0 20561 EUR 01 150 bla
1 20562 EUR NaN NaN NaN
2 20563 EUR NaN NaN NaN
3 20564 USD 02 180 bla
Is there a way to join on a partial match? I know how to "clean" df2['Ref'] with regex, but that is not the solution I am after. With a for loop, I get a long way but this isn't very Pythonic.
df4 = df1.copy()
for i, row in df1.iterrows():
    tmp = df2[df2['Ref'].str.contains(row['Invoice'])]
    df4.loc[i, 'Amount'] = tmp['Amount'].values[0]
print(df4)
Invoice Currency Amount
0 20561 EUR 150
1 20562 EUR 175
2 20563 EUR 160
3 20564 USD 180
Can str.contains() somehow be used in a more elegant way? Thank you so much in advance for your help!
This is one way using pd.Series.apply, which is just a thinly veiled loop. A "partial string merge" is what you are looking for, I'm not sure it exists in a vectorised form.
df4 = df1.copy()
def get_amount(x):
    return df2.loc[df2['Ref'].str.contains(x), 'Amount'].iloc[0]
df4['Amount'] = df4['Invoice'].apply(get_amount)
print(df4)
Currency Invoice Amount
0 EUR 20561 150
1 EUR 20562 175
2 EUR 20563 160
3 USD 20564 180
Here are two alternative solutions, both using Pandas' merge.
# Solution 1 (checking directly if 'Invoice' string is in the 'Ref' string)
df4 = df2.copy()
df4['Invoice'] = [val for idx, val in enumerate(df1['Invoice']) if val in df2['Ref'][idx]]
df_m4 = df1.merge(df4[['Amount', 'Invoice']], on='Invoice')
# Solution 2 (regex)
import re
df5 = df2.copy()
df5['Invoice'] = [re.findall(r'(\d{5})', s)[0] for s in df2['Ref']]
df_m5 = df1.merge(df5[['Amount', 'Invoice']], on='Invoice')
Both df_m4 and df_m5 will print
Currency Invoice Amount
0 EUR 20561 150
1 EUR 20562 175
2 EUR 20563 160
3 USD 20564 180
Note: The regex solution assumes that the invoice numbers are always 5 digits and only takes the first such occurrence in each 'Ref'. Solution 1 compares the strings directly, but it pairs rows by position, so it assumes df1 and df2 are in the same order and that every row matches.
The regex solution could be made more robust if needed.
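If the rows of the two frames are not guaranteed to line up, one order-independent option is a cross join followed by a substring filter. This is only a sketch. It assumes pandas 1.2 or later for how="cross", and it builds len(df1) * len(df2) candidate pairs, so it suits small to medium tables.

```python
import pandas as pd

df1 = pd.DataFrame({'Invoice': ['20561', '20562', '20563', '20564'],
                    'Currency': ['EUR', 'EUR', 'EUR', 'USD']})
df2 = pd.DataFrame({'Ref': ['20561', 'INV20562', 'INV20563BG', '20564'],
                    'Type': ['01', '03', '04', '02'],
                    'Amount': ['150', '175', '160', '180'],
                    'Comment': ['bla', 'bla', 'bla', 'bla']})

# Pair every invoice with every reference.
pairs = df1.merge(df2, how='cross')

# Keep only the pairs where the invoice number is a substring of the reference.
mask = [inv in ref for inv, ref in zip(pairs['Invoice'], pairs['Ref'])]
df3 = pairs[mask].drop(columns='Ref').reset_index(drop=True)
print(df3)
```

An invoice that matches several references would produce several rows, which at least makes the ambiguity visible.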
I am going crazy about this one. I am trying to add a new column to a data frame DF1, based on values found in another data frame DF2. This is how they look,
DF1=
Date Amount Currency
0 2014-08-20 -20000000 EUR
1 2014-08-20 -12000000 CAD
2 2014-08-21 10000 EUR
3 2014-08-21 20000 USD
4 2014-08-22 25000 USD
DF2=
NAME OPEN
0 EUR 10
1 CAD 20
2 USD 30
Now, I would like to create a new column in DF1, named 'Amount (Local)', where each amount in 'Amount' is multiplied by the matching value found in DF2, yielding this result:
DF1=
Date Amount Currency Amount (Local)
0 2014-08-20 -20000000 EUR -200000000
1 2014-08-20 -12000000 CAD -240000000
2 2014-08-21 10000 EUR 100000
3 2014-08-21 20000 USD 600000
4 2014-08-22 25000 USD 750000
If there exists a method for adding a column to DF1 based on a function, instead of just multiplication as the above problem, that would be very much appreciated also.
Thanks,
You can use map with a dict built from your second df (in my case it is called df1; yours is DF2), and then multiply the result by the amount:
In [65]:
df['Amount (Local)'] = df['Currency'].map(dict(df1[['NAME','OPEN']].values)) * df['Amount']
df
Out[65]:
Date Amount Currency Amount (Local)
index
0 2014-08-20 -20000000 EUR -200000000
1 2014-08-20 -12000000 CAD -240000000
2 2014-08-21 10000 EUR 100000
3 2014-08-21 20000 USD 600000
4 2014-08-22 25000 USD 750000
Breaking this down: map matches each value against the dict keys. Here we match Currency against the NAME keys, and the dict values are the OPEN values. The result is:
In [66]:
df['Currency'].map(dict(df1[['NAME','OPEN']].values))
Out[66]:
index
0 10
1 20
2 10
3 30
4 30
Name: Currency, dtype: int64
We then simply multiply this series against the Amount column from df (DF1 in your case) to get the desired result.
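On the follow-up about using a function instead of plain multiplication: once the rates are aligned with map, you can pass each amount and rate to any function. A sketch with the question's names (DF1, DF2), where to_local is a hypothetical placeholder for your own logic:

```python
import pandas as pd

DF1 = pd.DataFrame({
    "Date": ["2014-08-20", "2014-08-20", "2014-08-21", "2014-08-21", "2014-08-22"],
    "Amount": [-20000000, -12000000, 10000, 20000, 25000],
    "Currency": ["EUR", "CAD", "EUR", "USD", "USD"],
})
DF2 = pd.DataFrame({"NAME": ["EUR", "CAD", "USD"], "OPEN": [10, 20, 30]})

# Align one rate per row of DF1 by looking up Currency in DF2.
rate = DF1["Currency"].map(DF2.set_index("NAME")["OPEN"])

# Plain multiplication, as in the answer above.
DF1["Amount (Local)"] = DF1["Amount"] * rate

def to_local(amount, fx):
    """Hypothetical per-row function; replace the body with your own logic."""
    return amount * fx

# Apply an arbitrary function to each (amount, rate) pair.
DF1["Amount (Local, via function)"] = [
    to_local(a, r) for a, r in zip(DF1["Amount"], rate)
]
print(DF1)
```

The list comprehension is a plain Python loop, so prefer column arithmetic whenever the function can be written that way.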
Use fancy-indexing to create a currency array aligned with your data in df1, then use it in multiplication, and assign the result to a new column in df1:
import pandas as pd
ccy_series = pd.Series([10,20,30], index=['EUR', 'CAD', 'USD'])
df1 = pd.DataFrame({'amount': [-200, -120, 1, 2, 2.5], 'ccy': ['EUR', 'CAD', 'EUR', 'USD', 'USD']})
aligned_ccy = ccy_series[df1.ccy].reset_index(drop=True)
aligned_ccy
=>
0 10
1 20
2 10
3 30
4 30
dtype: int64
df1['amount_local'] = df1.amount *aligned_ccy
df1
=>
amount ccy amount_local
0 -200.0 EUR -2000.0
1 -120.0 CAD -2400.0
2 1.0 EUR 10.0
3 2.0 USD 60.0
4 2.5 USD 75.0