Converting a dataframe from inside a pandas Dataframe - Alice Blue API - python

Newbie alert, please bear with me.
The data I have right now, after running the line of code below, is:
Input-
data = pd.DataFrame(alice._AliceBlue__master_contracts_by_symbol)
Output-
Index NSE
1018GS2026 GS Instrument(exchange='NSE', token=6833, symbol='1018GS2026 GS', name='GOI LOAN 10.18% 2026', expiry=None, lot_size='1')
1025GS2021 GS Instrument(exchange='NSE', token=6819, symbol='1025GS2021 GS', name='GOI LOAN 10.25% 2021', expiry=None, lot_size='1')
116GS2020 GS Instrument(exchange='NSE', token=6814, symbol='116GS2020 GS', name='GOI LOAN 11.60% 2020', expiry=None, lot_size='1')
182D010721 TB Instrument(exchange='NSE', token=1776, symbol='182D010721 TB', name='GOI TBILL 182D-01/07/21', expiry=None, lot_size='100')
182D020921 TB Instrument(exchange='NSE', token=2593, symbol='182D020921 TB', name='GOI TBILL 182D-02/09/21', expiry=None, lot_size='100')
I want a dataframe like this from inside the above dataframe
Index Exchange token symbol name expiry lot_size
1018GS2026 GS NSE 6833 1018GS2026 GS GOI LOAN 10.18% 2026 None 1
1025GS2021 GS NSE 6819 1025GS2021 GS GOI LOAN 10.25% 2021 None 1
116GS2020 GS NSE 6814 116GS2020 GS GOI LOAN 11.60% 2020 None 1
182D010721 TB NSE 1776 182D010721 TB GOI TBILL 182D-01/07/21 None 100
182D020921 TB NSE 2593 182D020921 TB GOI TBILL 182D-02/09/21 None 100
Any suggestions? What should I do?

If the master contracts dict has the index as its key and an Instrument object as its value, then it should be easy to convert this.
rows = []
for val in alice._AliceBlue__master_contracts_by_symbol.values():
    # one list of field values per Instrument
    rows.append([val.exchange, val.token, val.symbol, val.name,
                 val.expiry, val.lot_size])
df = pd.DataFrame(rows,
                  index=alice._AliceBlue__master_contracts_by_symbol.keys(),
                  columns=['exchange', 'token', 'symbol', 'name', 'expiry', 'lot_size'])
Edit:
If the AliceBlue thing is really an OrderedDict of OrderedDicts, then it is even easier:
df = pd.DataFrame(
alice._AliceBlue__master_contracts_by_symbol.values(),
index=alice._AliceBlue__master_contracts_by_symbol.keys()
)

The correct way is below
df = pd.DataFrame(alice._AliceBlue__master_contracts_by_symbol)
df = pd.json_normalize([x._asdict() for x in df['NSE']]).set_index(df.index)
Thanks everyone for the help
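For readers without an Alice Blue session, the same flattening can be tried on stand-in data. The sketch below assumes each value is a namedtuple with the fields shown in the output above; the Instrument class and the contracts dict are hypothetical stand-ins, not the library's own objects.

```python
from collections import namedtuple

import pandas as pd

# Hypothetical stand-in for the library's Instrument objects
Instrument = namedtuple(
    "Instrument", ["exchange", "token", "symbol", "name", "expiry", "lot_size"]
)

# Hypothetical stand-in for the symbol -> Instrument mapping of one exchange
contracts = {
    "1018GS2026 GS": Instrument("NSE", 6833, "1018GS2026 GS", "GOI LOAN 10.18% 2026", None, "1"),
    "182D010721 TB": Instrument("NSE", 1776, "182D010721 TB", "GOI TBILL 182D-01/07/21", None, "100"),
}

# One row per namedtuple; the field names become the column names
flat = pd.DataFrame(
    list(contracts.values()),
    columns=list(Instrument._fields),
    index=list(contracts.keys()),
)
print(flat)
```

Because the rows are built directly from the tuples, no intermediate one-column dataframe is needed in this variant.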

Related

how to retrieve sector and industry for a list of tickers with python?

I have a list of tickers (below: tick1) that comes from the Earnings Report.
I would like to add the "shortname", "sector" and "industry" next to each ticker while creating a dataframe.
Unfortunately, the columns always get shuffled a bit and are not matched properly. For instance, VFC comes out as sector: Technology; industry: Semiconductors, which is wrong. It should be sector: Consumer Cyclical; industry: Apparel Manufacturing.
Here is my code below. Can you please help me adjust it?
---tickers to be read---
import yfinance as yf
with open("/Users/Doc/AB/Earnings/tickers.txt") as fh:
    tick1 = fh.read().split()
tickers in txt file
ABOS
ACRX
ADI
ADMP
ADOCY
AER
AGYS
AINV
ALBO
ALLT
AMAT
AMPS
AOZOY
ARCO
AREC
ARZGY
ATAI
AUTO
AVAL
AXDX
BAH
BBAR
BBWI
BHIL
BJ
BKYI
BLBX
BPCGY
BPTH
BRDS
BZFD
CAAP
CAE
CALT
CCHWF
CCSI
CELC
CFRHF
CGEN
CINT
CLSN
CMRX
CRLBF
CRXT
CSCO
CSWI
CVSI
CWBHF
CWBR
DAC
DADA
DE
DECK
DESP
DLO
DOYU
DTST
DUOT
EAST
EBR
EBR.B
EDAP
ENJY
EVTV
EXP
FATH
FL
FLO
FSI
FTK
FUV
FXLV
GAN
GBOX
GDS
GLBE
GLOB
GNLN
GOED
GOGL
GRAB
GRAMF
GRCL
HD
HOOK
HPK
HUYA
HWKN
HYRE
IBEX
IGIC
IKT
IMPL
INLB
INLX
INVO
IONM
IONQ
IPW
IPWR
ISUN
ITCTY
JBI
JD
JHX
JMIA
KALA
KBNT
KEYS
KMDA
KORE
KSLLF
KSS
KULR
LOW
LTRY
LUNA
LVLU
MARK
MBT
MCG
MCLD
MDWD
MDWT
MIGI
MIRO
MNDY
MNMD
MNRO
MSADY
MSGM
MUFG
MVST
NEXCF
NGS
NNOX
NOVN
NRDY
NRGV
NU
NXGN
OBSV
OEG
OMQS
ONON
PANW
PASG
PCYG
PEAR
PLNHF
PLX
PTE
PTN
PXS
QIPT
QRHC
QTEK
QUIK
RCRT
RDY
REE
REED
REKR
RKLB
RMED
RMTI
ROST
RSKD
RYAAY
SANW
SCVL
SDIG
SE
SHLS
SHPW
SHWZ
SLGG
SNPS
SPRO
SQM
SRAD
SSYS
SUNL
SUNW
SUPV
SYN
SYRS
TCEHY
TCRT
TCS
TGI
TGT
THBRF
TJX
TKOMY
TLLTF
TME
TRMR
TSEM
TSHA
TTWO
TXMD
USWS
VBLT
VERB
VEV
VFC
VIPS
VJET
VOXX
VTRU
VVOS
VWE
VYGVF
VYNT
WEBR
WEDXF
WEJO
WIX
WMS
WMT
WRBY
WYY
YALA
YOU
ZIM
---adding the shortname, sector, industry ---
from yahooquery import Ticker
import pandas as pd
symbols = tick1
tickers = Ticker(symbols, asynchronous=True)
datasi = tickers.get_modules("summaryProfile quoteType")
dfsi = pd.DataFrame.from_dict(datasi).T
dataframes = [pd.json_normalize([x for x in dfsi[module] if isinstance(x, dict)]) for
              module in ['summaryProfile', 'quoteType']]
dfsi = pd.concat(dataframes, axis=1)
dfsi
import pandas as pd
from yahooquery import Ticker
symbols = ['TSHA', 'GRAMF', 'VFC', 'ABOS', 'INLX', 'INVO', 'IONM', 'IONQ']
tickers = Ticker(symbols, asynchronous=True)
datasi = tickers.get_modules("summaryProfile quoteType")
dfsi = pd.DataFrame.from_dict(datasi).T
dataframes = [pd.json_normalize([x for x in dfsi[module] if isinstance(x, dict)]) for
              module in ['summaryProfile', 'quoteType']]
dfsi = pd.concat(dataframes, axis=1)
dfsi = dfsi.set_index('symbol')
dfsi = dfsi.loc[symbols]
print(dfsi[['industry', 'sector']])
Output
industry sector
symbol
TSHA Biotechnology Healthcare
GRAMF Drug Manufacturers—Specialty & Generic Healthcare
VFC Apparel Manufacturing Consumer Cyclical
ABOS Biotechnology Healthcare
INLX Software—Application Technology
INVO Medical Devices Healthcare
IONM Medical Care Facilities Healthcare
IONQ Computer Hardware Technology
Try the code above: set the 'symbol' column as the index, then pass the ticker list to .loc so the rows come back in the requested order. You should still double-check the results.
I ran the ticker 'VFC' several times and got industry: Apparel Manufacturing, sector: Consumer Cyclical.
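One possible cause of the shuffled rows is that each module is filtered and flattened separately, so a ticker with a missing module shifts every later row when the frames are joined by position. A minimal sketch of a per-symbol alternative follows. It assumes the get_modules result is a dict keyed by symbol whose values are dicts of modules (or a non-dict value on failure); build_profile_table and the sample raw data are hypothetical and were not fetched from the API.

```python
import pandas as pd


def build_profile_table(raw, symbols, fields=("shortName", "sector", "industry")):
    """Build one row per requested symbol, looking every field up by symbol."""
    rows = []
    for symbol in symbols:
        modules = raw.get(symbol)
        merged = {}
        if isinstance(modules, dict):
            # Merge the module dicts of this one symbol; skip modules that failed
            for module in modules.values():
                if isinstance(module, dict):
                    merged.update(module)
        # Missing fields become None instead of shifting other rows
        rows.append({field: merged.get(field) for field in fields})
    return pd.DataFrame(rows, index=pd.Index(symbols, name="symbol"))


# Hypothetical sample shaped like the assumed get_modules result (illustrative values)
raw = {
    "VFC": {
        "summaryProfile": {"sector": "Consumer Cyclical", "industry": "Apparel Manufacturing"},
        "quoteType": {"symbol": "VFC", "shortName": "VFC short name"},
    },
    "XXXX": "no data",  # made-up ticker standing in for a failed lookup
    "ADI": {
        "summaryProfile": {"sector": "Technology", "industry": "Semiconductors"},
        "quoteType": {"symbol": "ADI", "shortName": "ADI short name"},
    },
}

table = build_profile_table(raw, ["VFC", "XXXX", "ADI"])
print(table)
```

Since every value is looked up under its own symbol, a ticker without data yields an empty row rather than misaligning the rest.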

Concatenating data frames pandas

I would like to pull historical close prices with the yfinance module and create a data frame with one column of closing prices for each of the tickers stored in the Holdings list. I can do everything except creating that data frame at the end. Can someone please help?
Holdings = ['RL', 'AMC', 'BYND', 'BRK-B',
'BBY', 'AYX', 'AAPL', 'KO',
'FB', 'RACE', 'INTC', 'PFE',
'CRM', 'WFC', 'JPM', 'GOOG']
Hist_Holdings = []
for symbol in Holdings:
    Ticker = yf.Ticker(symbol)
    Hist = Ticker.history(period = "6mo", interval = "1d")
    Hist = Hist['Close']
    Hist.columns = [symbol]
    Hist_Holdings.append(Hist)
The desired data frame format is not specified, but the following code passes all the tickers to yf.download as one space-separated string. It is fast and returns the data as a single data frame. The code below selects only the closing prices.
import yfinance as yf
import datetime
now_ = datetime.datetime.today()
# Subtract timedeltas instead of editing the month/day fields, which fails across month and year boundaries
start = now_ - datetime.timedelta(days=182)  # roughly six months back
end = now_ - datetime.timedelta(days=1)
Holdings = 'RL AMC BYND BRK-B BBY AYX AAPL KO FB RACE INTC PFE CRM WFC JPM GOOG'
data = yf.download(Holdings, start=start, end=end)['Close']
AAPL AMC AYX BBY BRK-B BYND CRM FB GOOG INTC JPM KO PFE RACE RL WFC
Date
2020-06-12 84.699997 5.89 141.130005 77.760002 181.210007 144.740005 175.110001 228.580002 1413.180054 59.330002 99.870003 45.599998 32.020874 167.889999 74.769997 27.969999
2020-06-15 85.747498 5.80 142.940002 80.010002 181.550003 154.000000 178.610001 232.500000 1419.849976 60.099998 101.250000 46.299999 31.650854 169.699997 73.739998 28.209999
2020-06-16 88.019997 5.56 145.690002 83.470001 182.300003 151.940002 180.479996 235.649994 1442.719971 60.400002 102.059998 46.770000 31.688805 169.690002 76.419998 28.520000
2020-06-17 87.897499 5.42 150.990005 83.239998 180.860001 156.339996 181.399994 235.529999 1451.119995 60.490002 99.480003 46.580002 31.840607 169.809998 74.480003 27.450001
2020-06-18 87.932503 5.63 160.779999 82.300003 180.729996 158.199997 187.660004 235.940002 1435.959961 60.080002 98.940002 46.990002 31.537003 168.580002 73.940002 27.549999
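If the loop from the question is kept, the collected Series can also be joined side by side with pd.concat. A minimal sketch, with made-up prices standing in for the downloaded 'Close' columns (the closes list and its values are hypothetical):

```python
import pandas as pd

dates = pd.date_range("2020-06-12", periods=3, freq="D")

# Hypothetical stand-ins for the per-ticker 'Close' Series collected in the loop
holdings = ["AAPL", "KO"]
closes = [
    pd.Series([84.7, 85.7, 88.0], index=dates),
    pd.Series([45.6, 46.3, 46.8], index=dates),
]

# axis=1 places each Series in its own column; keys supplies the column names
prices = pd.concat(closes, axis=1, keys=holdings)
print(prices)
```

Passing keys avoids having to rename each Series inside the loop.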

A list of tickers to get sector and name

import pandas as pd
import datetime as dt
from pandas_datareader import data as web
import yfinance as yf
yf.pdr_override()
filename=r'C:\Users\User\Desktop\from_python\data_from_python.xlsx'
yeah = pd.read_excel(filename, sheet_name='entry')
stock = []
stock = list(yeah['name'])
stock = [ s.replace('\xa0', '') for s in stock if not pd.isna(s) ]
adj_close=pd.DataFrame([])
high_price=pd.DataFrame([])
low_price=pd.DataFrame([])
volume=pd.DataFrame([])
print(stock)
['^GSPC', 'NQ=F', 'AAU', 'ALB', 'AOS', 'APPS', 'AQB', 'ASPN', 'ATHM', 'AZRE', 'BCYC', 'BGNE', 'CAT', 'CC', 'CLAR', 'CLCT', 'CMBM', 'CMT', 'CRDF', 'CYD', 'DE', 'DKNG', 'EARN', 'EMN', 'FBIO', 'FBRX', 'FCX', 'FLXS', 'FMC', 'FMCI', 'GME', 'GRVY', 'HAIN', 'HBM', 'HIBB', 'IEX', 'IOR', 'KFS', 'MAXR', 'MPX', 'MRTX', 'NSTG', 'NVCR', 'NVO', 'OESX', 'PENN', 'PLL', 'PRTK', 'RDY', 'REGI', 'REKR', 'SBE', 'SQM', 'TCON', 'TCS', 'TGB', 'TPTX', 'TRIL', 'UEC', 'VCEL', 'VOXX', 'WIT', 'WKHS', 'XNCR']
for symbol in stock:
    adj_close[symbol] = web.get_data_yahoo([symbol],start,end)['Adj Close']
I have a list of tickers and have already got the adjusted close prices. How can I get the NAME and SECTOR of these tickers?
For a single ticker, I found on the web that it can be done as below:
sbux = yf.Ticker("SBUX")
tlry = yf.Ticker("TLRY")
print(sbux.info['sector'])
print(tlry.info['sector'])
How can I build this into a dataframe that I can write to Excel, as I am doing for the adjusted close prices?
Thanks a lot!
You can try this answer using a package called yahooquery. Disclaimer: I am the author of the package.
from yahooquery import Ticker
import pandas as pd
symbols = ['^GSPC', 'NQ=F', 'AAU', 'ALB', 'AOS', 'APPS', 'AQB', 'ASPN', 'ATHM', 'AZRE', 'BCYC', 'BGNE', 'CAT', 'CC', 'CLAR', 'CLCT', 'CMBM', 'CMT', 'CRDF', 'CYD', 'DE', 'DKNG', 'EARN', 'EMN', 'FBIO', 'FBRX', 'FCX', 'FLXS', 'FMC', 'FMCI', 'GME', 'GRVY', 'HAIN', 'HBM', 'HIBB', 'IEX', 'IOR', 'KFS', 'MAXR', 'MPX', 'MRTX', 'NSTG', 'NVCR', 'NVO', 'OESX', 'PENN', 'PLL', 'PRTK', 'RDY', 'REGI', 'REKR', 'SBE', 'SQM', 'TCON', 'TCS', 'TGB', 'TPTX', 'TRIL', 'UEC', 'VCEL', 'VOXX', 'WIT', 'WKHS', 'XNCR']
# Create Ticker instance, passing symbols as first argument
# Optional asynchronous argument allows for asynchronous requests
tickers = Ticker(symbols, asynchronous=True)
data = tickers.get_modules("summaryProfile quoteType")
df = pd.DataFrame.from_dict(data).T
# flatten dicts within each column, creating new dataframes
dataframes = [pd.json_normalize([x for x in df[module] if isinstance(x, dict)]) for module in ['summaryProfile', 'quoteType']]
# concat dataframes from previous step
df = pd.concat(dataframes, axis=1)
# View columns
df.columns
Index(['address1', 'address2', 'city', 'state', 'zip', 'country', 'phone',
'fax', 'website', 'industry', 'sector', 'longBusinessSummary',
'fullTimeEmployees', 'companyOfficers', 'maxAge', 'exchange',
'quoteType', 'symbol', 'underlyingSymbol', 'shortName', 'longName',
'firstTradeDateEpochUtc', 'timeZoneFullName', 'timeZoneShortName',
'uuid', 'messageBoardId', 'gmtOffSetMilliseconds', 'maxAge'],
dtype='object')
# Data you're looking for
df[['symbol', 'shortName', 'sector']].head(10)
symbol shortName sector
0 NQZ20.CME Nasdaq 100 Dec 20 NaN
1 ALB Albemarle Corporation Basic Materials
2 AOS A.O. Smith Corporation Industrials
3 ASPN Aspen Aerogels, Inc. Industrials
4 AAU Almaden Minerals, Ltd. Basic Materials
5 ^GSPC S&P 500 NaN
6 ATHM Autohome Inc. Communication Services
7 AQB AquaBounty Technologies, Inc. Consumer Defensive
8 APPS Digital Turbine, Inc. Technology
9 BCYC Bicycle Therapeutics plc Healthcare
This fetches the prices and the sectors in the same loop. Some symbols do not have a sector, so the lookup is wrapped in a try/except.
Each column is keyed by the sector and the ticker name, and the columns are then converted to a hierarchical (MultiIndex) column index. Finally, the result is saved in CSV format so it can be imported into Excel. I only tried a few of the symbols because the list is long, so there may be some issues with the rest.
import datetime
import pandas as pd
import yfinance as yf
import pandas_datareader.data as web
yf.pdr_override()
start = "2018-01-01"
end = "2019-01-01"
# symbol = ['^GSPC', 'NQ=F', 'AAU', 'ALB', 'AOS', 'APPS', 'AQB', 'ASPN', 'ATHM', 'AZRE', 'BCYC', 'BGNE', 'CAT',
#'CC', 'CLAR', 'CLCT', 'CMBM', 'CMT', 'CRDF', 'CYD', 'DE', 'DKNG', 'EARN', 'EMN', 'FBIO', 'FBRX', 'FCX', 'FLXS',
#'FMC', 'FMCI', 'GME', 'GRVY', 'HAIN', 'HBM', 'HIBB', 'IEX', 'IOR', 'KFS', 'MAXR', 'MPX', 'MRTX', 'NSTG', 'NVCR',
#'NVO', 'OESX', 'PENN', 'PLL', 'PRTK', 'RDY', 'REGI', 'REKR', 'SBE', 'SQM', 'TCON', 'TCS', 'TGB', 'TPTX', 'TRIL',
#'UEC', 'VCEL', 'VOXX', 'WIT', 'WKHS', 'XNCR']
stock = ['^GSPC', 'NQ=F', 'AAU', 'ALB', 'AOS', 'APPS']
adj_close = pd.DataFrame([])
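for symbol in stock:
    try:
        info = yf.Ticker(symbol).info  # fetch the metadata once per symbol
        sector = info['sector']
        name = info['shortName']
    except Exception:
        # some symbols have no sector or name
        sector = 'None'
        name = 'None'
    # key each column by (sector, 'symbol_name'), matching the output below
    adj_close[sector, symbol + '_' + name] = web.get_data_yahoo(symbol, start=start, end=end)['Adj Close']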
idx = pd.MultiIndex.from_tuples(adj_close.columns)
adj_close.columns = idx
adj_close.head()
None Basic Materials Industrials Technology
^GSPC_None NQ=F_None AAU_None ALB_Albemarle Corporation AOS_A.O. Smith Corporation APPS_Digital Turbine, Inc.
2018-01-02 2695.810059 6514.75 1.03 125.321663 58.657742 1.79
2018-01-03 2713.060059 6584.50 1.00 125.569397 59.010468 1.87
2018-01-04 2723.989990 6603.50 0.98 124.073502 59.286930 1.86
2018-01-05 2743.149902 6667.75 1.00 125.502716 60.049587 1.96
2018-01-08 2747.709961 6688.00 0.95 130.962250 60.335583 1.96
# for excel
adj_close.to_csv('stock.csv', sep=',')
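The hierarchical columns make it possible to select by sector afterwards. A small sketch with made-up numbers; the frame below is hypothetical and only mimics the column layout shown above.

```python
import pandas as pd

# Hypothetical prices; tuple keys are (sector, ticker_name) as in the layout above
prices = pd.DataFrame(
    {
        ("Basic Materials", "ALB_Albemarle Corporation"): [125.3, 125.6],
        ("Industrials", "AOS_A.O. Smith Corporation"): [58.7, 59.0],
        ("Basic Materials", "AAU_Almaden Minerals, Ltd."): [1.03, 1.00],
    },
    index=pd.to_datetime(["2018-01-02", "2018-01-03"]),
)
prices.columns = pd.MultiIndex.from_tuples(prices.columns, names=["sector", "ticker"])

# Selecting on the first level returns every ticker in that sector
basic = prices["Basic Materials"]
print(basic)
```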

Transform data to growth rates in Python

I have two variables and I want to express one of them (monetary_base) in terms of monthly growth.
How can I do that? In R you would first transform the data into a time series; is this also the case in Python?
# Fetching the series we need
inflacion = llamada_api('https://api.estadisticasbcra.com/inflacion_mensual_oficial')
base_monetaria = llamada_api('https://api.estadisticasbcra.com/base')
# Building the DataFrames
df = pd.DataFrame(inflacion)
df_bm = pd.DataFrame(base_monetaria)
# Renaming columns
df = df.rename(columns={'d':'Fecha',
                        'v':'IPC'})
df_bm = df_bm.rename(columns={'d':'Fecha',
                              'v':'base_monetaria'})
# Fixing the data types
df['Fecha']=pd.to_datetime(df['Fecha'])
df_bm['Fecha']=pd.to_datetime(df_bm['Fecha'])
# Checking that the dates are in datetime format
df['Fecha'].dtype
df_bm['Fecha'].dtype
# Filtering
df_ipc = df[(df['Fecha'] > '2002-12-31')]
df_bm_filter = df_bm[(df_bm['Fecha'] > '2002-12-31')]
# Plotting
plt.figure(figsize=(14,12))
df_ipc.plot(x = 'Fecha', y = 'IPC')
plt.title('IPC-Mensual', fontdict={'fontsize':20})
plt.ylabel('IPC')
plt.xticks(rotation=45)
plt.show()
The data looks like this
Fecha base_monetaria
1748 2003-01-02 29302
1749 2003-01-03 29360
1750 2003-01-06 29524
1751 2003-01-07 29867
1752 2003-01-08 29957
... ...
5966 2020-02-18 1941302
5967 2020-02-19 1941904
5968 2020-02-20 1887975
5969 2020-02-21 1855477
5970 2020-02-26 1807042
The idea is to take the data for the last day of the month and calculate the growth rate with the data for the last day of the previous month.
You can try something like this
from pandas.tseries.offsets import MonthEnd
import pandas as pd
df = pd.DataFrame({'Fecha': ['2020-01-31', '2020-02-29', '2020-03-31', '2020-05-31', '2020-04-30', '2020-07-31', '2020-06-30', '2020-08-31', '2020-09-30', '2020-10-31', '2020-11-30', '2020-12-31'],
'price': ['32132', '54321', '3213121', '432123', '32132', '54321', '32132', '54321', '3213121', '432123', '32132', '54321']})
df['Fecha'] = df['Fecha'].astype('datetime64[ns]')
df['is_month_end'] = df['Fecha'].dt.is_month_end
df = df[df['is_month_end'] == True]
df.sort_values('Fecha',inplace=True)
df.reset_index(drop=True, inplace = True)
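def change(x,y):
    try:
        index = df[df['Fecha']==y].index.item()
        # previous month-end row, looked up by column label
        last = df.loc[index-1, 'price']
        return float(x)/float(last)
    except Exception:
        # the first row has no previous month
        return 0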
df['new_column'] = df.apply(lambda row: change(row['price'],row['Fecha']), axis=1)
df.head(12)
Assuming base_monetaria is a monthly cumulative value, then:
df = pd.DataFrame({'Fecha': ['2020-01-31', '2020-02-29', '2020-03-31', '2020-05-31', '2020-04-30', '2020-07-31', '2020-06-30', '2020-08-31', '2020-09-30', '2020-10-31', '2020-11-30', '2020-12-31'],
'price': [32132, 54321, 3213121, 432123, 32132, 54321, 32132, 54321, 3213121, 432123, 32132, 54321]})
df['Fecha'] = pd.to_datetime(df['Fecha'])
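df.set_index('Fecha', inplace=True)
df.sort_index(inplace=True)  # the sample dates are out of order; shift(1) below needs chronological rows
new_df = df.groupby(pd.Grouper(freq="M")).tail(1).reset_index()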
new_df['rate'] = (new_df['price'] -new_df['price'].shift(1))/new_df['price'].shift(1)
The new_df['rate'] will give you the growth rate the way you explained in the comment below
The problem can be solved by creating a column with the lagged values of base_monetaria:
df_bm_filter['is_month_end'] = df_bm_filter['Fecha'].dt.is_month_end
df_last_date = df_bm_filter[df_bm_filter['is_month_end'] == True]
df_last_date['base_monetaria_lag'] = df_last_date['base_monetaria'].shift(1)
df_last_date['bm_growth'] = (df_last_date['base_monetaria'] - df_last_date['base_monetaria_lag']) / df_last_date['base_monetaria_lag']
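Because the series only has business days, the last observation of a month does not always fall on the calendar month end, and is_month_end would then skip that month. A hedged alternative sketch groups by calendar month and takes the last available row. The sample values below are made up; only the column names follow the question.

```python
import pandas as pd

# Hypothetical daily observations; only the column names follow the question
df_bm = pd.DataFrame(
    {
        "Fecha": pd.to_datetime(
            ["2020-01-02", "2020-01-30", "2020-02-03", "2020-02-26", "2020-03-31"]
        ),
        "base_monetaria": [100, 110, 120, 121, 133.1],
    }
)

# Last available observation per calendar month, whatever day it falls on
monthly = (
    df_bm.sort_values("Fecha")
    .groupby(df_bm["Fecha"].dt.to_period("M"))["base_monetaria"]
    .last()
)

# Month-over-month growth: (current - previous) / previous
growth = monthly.pct_change()
print(growth)
```

The first month has no predecessor, so its growth rate is NaN.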

Try to include a column based on input and file name in Pandas Dataframe in Python

I have several CSV files which have the following structure:
Erster Hoch Tief Schlusskurs Stuecke Volumen
Datum
14.02.2017 151.55 152.35 151.05 152.25 110.043 16.687.376
13.02.2017 149.85 152.20 149.25 151.25 415.76 62.835.200
10.02.2017 149.00 150.05 148.65 149.40 473.664 70.746.088
09.02.2017 144.75 148.45 144.35 148.00 642.175 94.348.392
Erster Hoch Tief Schlusskurs Stuecke Volumen
Datum
14.02.2017 111.454 111.776 111.454 111.776 44 4.918
13.02.2017 110.570 110.989 110.570 110.989 122 13.535
10.02.2017 109.796 110.705 109.796 110.705 0 0
09.02.2017 107.993 108.750 107.993 108.750 496 53.933
The files differ in the WKN that is part of the file name:
wkn_A1EWWW_historic.csv
wkn_A0YAQA_historic.csv
I want to have the following Output:
Date wkn Open High low Close pieced Volume
14.02.2017 A1EWWW 151.55 152.35 151.05 152.25 110.043 16.687.376
13.02.2017 A1EWWW 149.85 152.20 149.25 151.25 415.76 62.835.200
10.02.2017 A1EWWW 149.00 150.05 148.65 149.40 473.664 70.746.088
09.02.2017 A1EWWW 144.75 148.45 144.35 148.00 642.175 94.348.392
Date wkn Open High low Close pieced Volume
14.02.2017 A0YAQA 111.454 111.776 111.454 111.776 44 4.918
13.02.2017 A0YAQA 110.570 110.989 110.570 110.989 122 13.535
10.02.2017 A0YAQA 109.796 110.705 109.796 110.705 0 0
09.02.2017 A0YAQA 107.993 108.750 107.993 108.750 496 53.933
The code looks like the following:
import pandas as pd
wkn_list_dummy = {'A0YAQA','A1EWWW'}
for w_list in wkn_list_dummy:
    url = 'C:/wkn_'+str(w_list)+'_historic.csv'
    df = pd.read_csv(url, encoding='cp1252', sep=';', decimal=',', index_col=0)
    print(df)
I tried using melt but it was not working.
You can add a column by just assigning a value to it:
df['new_column'] = 'string'
All together:
import pandas as pd
wkn_list_dummy = {'A0YAQA','A1EWWW'}
frames = []
for w_list in wkn_list_dummy:
    url = 'C:/wkn_'+str(w_list)+'_historic.csv'
    df = pd.read_csv(url, encoding='cp1252', sep=';', decimal=',', index_col=0)
    df['wkn'] = w_list
    frames.append(df)
# DataFrame.append was removed in pandas 2.0, so concatenate once at the end
final_df = pd.concat(frames)
final_df.reset_index(inplace=True)
print(final_df)
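When there are many files, the WKN can also be read from the file name instead of being typed into a list. This is a sketch under the assumption that every file follows the wkn_<WKN>_historic.csv pattern from the question; the temporary directory and the tiny sample files are hypothetical and exist only to make the example self-contained.

```python
import glob
import os
import re
import tempfile

import pandas as pd

# Create two tiny sample files (hypothetical content in the question's layout)
folder = tempfile.mkdtemp()
sample = "Datum;Erster;Hoch\n14.02.2017;151,55;152,35\n13.02.2017;149,85;152,20\n"
for wkn in ("A1EWWW", "A0YAQA"):
    with open(os.path.join(folder, f"wkn_{wkn}_historic.csv"), "w", encoding="cp1252") as fh:
        fh.write(sample)

frames = []
for path in sorted(glob.glob(os.path.join(folder, "wkn_*_historic.csv"))):
    # Pull the WKN out of the file name
    wkn = re.search(r"wkn_(.+)_historic\.csv$", os.path.basename(path)).group(1)
    df = pd.read_csv(path, encoding="cp1252", sep=";", decimal=",", index_col=0)
    df["wkn"] = wkn
    frames.append(df)

# One concat at the end, then move the date index back into a column
final_df = pd.concat(frames).reset_index()
print(final_df)
```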
