Concatenating data frames pandas - python

I would like to get historical close prices from the yfinance module and create a data frame with one column of closing prices for each of the tickers stored in the Holdings list. I can do everything except create that data frame at the end. Can someone please help?
import yfinance as yf
Holdings = ['RL', 'AMC', 'BYND', 'BRK-B',
            'BBY', 'AYX', 'AAPL', 'KO',
            'FB', 'RACE', 'INTC', 'PFE',
            'CRM', 'WFC', 'JPM', 'GOOG']
Hist_Holdings = []
for symbol in Holdings:
    Ticker = yf.Ticker(symbol)
    Hist = Ticker.history(period="6mo", interval="1d")
    # Hist['Close'] is a Series, so label it via .name rather than .columns
    Hist = Hist['Close']
    Hist.name = symbol
    Hist_Holdings.append(Hist)

You did not say which data frame layout you want, but yf.download accepts several tickers as one space-separated string and returns them in a single data frame, with one column per ticker. It is fast, and selecting ['Close'] keeps only the closing prices.
import datetime
import yfinance as yf
# Use timedelta arithmetic: subtracting 6 from the month number (or adding 1 to
# the day) produces invalid dates around year and month boundaries.
end = datetime.datetime.today() - datetime.timedelta(days=1)
start = end - datetime.timedelta(days=182)  # roughly six months back
Holdings = 'RL AMC BYND BRK-B BBY AYX AAPL KO FB RACE INTC PFE CRM WFC JPM GOOG'
data = yf.download(Holdings, start=start, end=end)['Close']
AAPL AMC AYX BBY BRK-B BYND CRM FB GOOG INTC JPM KO PFE RACE RL WFC
Date
2020-06-12 84.699997 5.89 141.130005 77.760002 181.210007 144.740005 175.110001 228.580002 1413.180054 59.330002 99.870003 45.599998 32.020874 167.889999 74.769997 27.969999
2020-06-15 85.747498 5.80 142.940002 80.010002 181.550003 154.000000 178.610001 232.500000 1419.849976 60.099998 101.250000 46.299999 31.650854 169.699997 73.739998 28.209999
2020-06-16 88.019997 5.56 145.690002 83.470001 182.300003 151.940002 180.479996 235.649994 1442.719971 60.400002 102.059998 46.770000 31.688805 169.690002 76.419998 28.520000
2020-06-17 87.897499 5.42 150.990005 83.239998 180.860001 156.339996 181.399994 235.529999 1451.119995 60.490002 99.480003 46.580002 31.840607 169.809998 74.480003 27.450001
2020-06-18 87.932503 5.63 160.779999 82.300003 180.729996 158.199997 187.660004 235.940002 1435.959961 60.080002 98.940002 46.990002 31.537003 168.580002 73.940002 27.549999
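If you would rather keep the loop from the question, which collects one Close Series per ticker in a list, the missing last step is probably a column-wise pd.concat. Below is a minimal sketch of that step only. It assumes every Series in the list carries the ticker as its name, and it uses two hand-made Series with sample values in place of the yfinance downloads so that it runs without network access.

```python
import pandas as pd

# Stand-ins for the per-ticker Close Series built in the loop (sample values only)
dates = pd.date_range("2020-06-12", periods=3, freq="D")
hist_holdings = [
    pd.Series([84.70, 85.75, 88.02], index=dates, name="AAPL"),
    pd.Series([5.89, 5.80, 5.56], index=dates, name="AMC"),
]

# axis=1 places each Series in its own column; the Series names become the column labels
closes = pd.concat(hist_holdings, axis=1)
print(closes)
```

Rows are aligned on the date index, so tickers with missing days simply get NaN in those rows.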

Related

convert UNIX timestamp to US/Central using tz_convert

I'm trying to convert my UNIX timestamp to a US/Central timestamp, but I keep getting the UTC output. I don't know what I'm doing wrong in the code.
import ccxt
import pandas as pd
from dateutil import tz
binance = ccxt.binance({
    'enableRateLimit': True,
    'apiKey': 'xxxxxxxxxxxxxxxxxxx',
    'secret': 'xxxxxxxxxxxxx'
})
symbol = 'ETHUSDT'
timeframe = '15m'
limit = 500
bars = binance.fetch_ohlcv(symbol, timeframe=timeframe, limit=limit)
df = pd.DataFrame(bars, columns = ['timestamp','open','high','low', 'close', 'volume'])
df['timestamp'] = pd.to_datetime(df['timestamp'], unit = 'ms').dt.tz_localize(tz='US/Central')
df['timestamp'] = pd.to_datetime(df['timestamp'], unit = 'ms').dt.tz_convert(tz='US/Central')
print(df)
timestamp open high low close volume
0 2022-11-21 12:15:00-06:00 1120.63 1122.74 1118.26 1119.31 3278.5060
1 2022-11-21 12:30:00-06:00 1119.30 1127.48 1115.10 1125.31 11065.4442
2 2022-11-21 12:45:00-06:00 1125.32 1128.36 1123.92 1127.30 5447.6054
3 2022-11-21 13:00:00-06:00 1127.30 1136.75 1125.67 1133.81 15977.1500
4 2022-11-21 13:15:00-06:00 1133.82 1146.99 1132.77 1139.39 21009.7356
.. ... ... ... ... ... ...
495 2022-11-26 16:00:00-06:00 1210.90 1212.87 1208.77 1212.54 3822.1327
496 2022-11-26 16:15:00-06:00 1212.55 1213.92 1212.09 1213.63 2414.2695
497 2022-11-26 16:30:00-06:00 1213.62 1213.63 1211.05 1212.89 2461.4644
498 2022-11-26 16:45:00-06:00 1212.89 1212.94 1209.00 1209.76 2544.8965
499 2022-11-26 17:00:00-06:00 1209.75 1210.00 1207.74 1209.77 1638.1446
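I think you want the following. Epoch milliseconds are in UTC, so localize the parsed timestamps to UTC first and then convert them to US/Central. The final tz_localize(None) drops the offset and leaves naive local times.
df["timestamp"] = (
    pd.to_datetime(df["timestamp"], unit="ms")
    .dt.tz_localize("UTC")
    .dt.tz_convert("US/Central")
    .dt.tz_localize(None)
)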
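To check the conversion without calling the exchange, here is a small sketch of the same chain on two hand-picked epoch values. The sample timestamps and the column names central and central_naive are made up for illustration.

```python
import pandas as pd

# Sample epoch-millisecond values; the first one is 2022-11-21 18:15:00 UTC
df = pd.DataFrame({"timestamp": [1669054500000, 1669055400000]})

# Step 1: parse as naive datetimes and declare them to be UTC
utc = pd.to_datetime(df["timestamp"], unit="ms").dt.tz_localize("UTC")

# Step 2: convert to US/Central (keeps the -06:00 offset in November)
df["central"] = utc.dt.tz_convert("US/Central")

# Step 3 (optional): drop the offset and keep the local wall-clock time
df["central_naive"] = df["central"].dt.tz_localize(None)
print(df)
```

Calling tz_localize("US/Central") directly on the parsed values would instead label the UTC wall-clock time as Central time without shifting it.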

how to retrieve sector and industry for a list of tickers with python?

I have a list of tickers (below: tick1) that comes from the Earnings Report.
I would like to add the "shortname", "sector" and "industry" next to each ticker while creating a dataframe.
Unfortunately, the columns always get shuffled and the values are not matched to the right ticker. For instance, VFC shows sector: Technology; industry: Semiconductors, which is wrong. It should be sector: Consumer Cyclical; industry: Apparel Manufacturing.
Here is my code below. Can you please help me adjust it?
---tickers to be read---
import yfinance as yf
with open("/Users/Doc/AB/Earnings/tickers.txt") as fh:
    tick1 = fh.read().split()
Tickers in the txt file:
ABOS
ACRX
ADI
ADMP
ADOCY
AER
AGYS
AINV
ALBO
ALLT
AMAT
AMPS
AOZOY
ARCO
AREC
ARZGY
ATAI
AUTO
AVAL
AXDX
BAH
BBAR
BBWI
BHIL
BJ
BKYI
BLBX
BPCGY
BPTH
BRDS
BZFD
CAAP
CAE
CALT
CCHWF
CCSI
CELC
CFRHF
CGEN
CINT
CLSN
CMRX
CRLBF
CRXT
CSCO
CSWI
CVSI
CWBHF
CWBR
DAC
DADA
DE
DECK
DESP
DLO
DOYU
DTST
DUOT
EAST
EBR
EBR.B
EDAP
ENJY
EVTV
EXP
FATH
FL
FLO
FSI
FTK
FUV
FXLV
GAN
GBOX
GDS
GLBE
GLOB
GNLN
GOED
GOGL
GRAB
GRAMF
GRCL
HD
HOOK
HPK
HUYA
HWKN
HYRE
IBEX
IGIC
IKT
IMPL
INLB
INLX
INVO
IONM
IONQ
IPW
IPWR
ISUN
ITCTY
JBI
JD
JHX
JMIA
KALA
KBNT
KEYS
KMDA
KORE
KSLLF
KSS
KULR
LOW
LTRY
LUNA
LVLU
MARK
MBT
MCG
MCLD
MDWD
MDWT
MIGI
MIRO
MNDY
MNMD
MNRO
MSADY
MSGM
MUFG
MVST
NEXCF
NGS
NNOX
NOVN
NRDY
NRGV
NU
NXGN
OBSV
OEG
OMQS
ONON
PANW
PASG
PCYG
PEAR
PLNHF
PLX
PTE
PTN
PXS
QIPT
QRHC
QTEK
QUIK
RCRT
RDY
REE
REED
REKR
RKLB
RMED
RMTI
ROST
RSKD
RYAAY
SANW
SCVL
SDIG
SE
SHLS
SHPW
SHWZ
SLGG
SNPS
SPRO
SQM
SRAD
SSYS
SUNL
SUNW
SUPV
SYN
SYRS
TCEHY
TCRT
TCS
TGI
TGT
THBRF
TJX
TKOMY
TLLTF
TME
TRMR
TSEM
TSHA
TTWO
TXMD
USWS
VBLT
VERB
VEV
VFC
VIPS
VJET
VOXX
VTRU
VVOS
VWE
VYGVF
VYNT
WEBR
WEDXF
WEJO
WIX
WMS
WMT
WRBY
WYY
YALA
YOU
ZIM
---adding the shortname, sector, industry ---
from yahooquery import Ticker
import pandas as pd
symbols = tick1
tickers = Ticker(symbols, asynchronous=True)
datasi = tickers.get_modules("summaryProfile quoteType")
dfsi = pd.DataFrame.from_dict(datasi).T
dataframes = [pd.json_normalize([x for x in dfsi[module] if isinstance(x, dict)])
              for module in ['summaryProfile', 'quoteType']]
dfsi = pd.concat(dataframes, axis=1)
dfsi
Try the following: set the 'symbol' column as the index, then reorder the rows with the original ticker list. You should still check the result yourself. I ran it for 'VFC' several times and got industry: Apparel Manufacturing, sector: Consumer Cyclical.
import pandas as pd
from yahooquery import Ticker
symbols = ['TSHA', 'GRAMF', 'VFC', 'ABOS', 'INLX', 'INVO', 'IONM', 'IONQ']
tickers = Ticker(symbols, asynchronous=True)
datasi = tickers.get_modules("summaryProfile quoteType")
dfsi = pd.DataFrame.from_dict(datasi).T
dataframes = [pd.json_normalize([x for x in dfsi[module] if isinstance(x, dict)])
              for module in ['summaryProfile', 'quoteType']]
dfsi = pd.concat(dataframes, axis=1)
# Index by symbol, then restore the order of the input list
dfsi = dfsi.set_index('symbol')
dfsi = dfsi.loc[symbols]
print(dfsi[['industry', 'sector']])
Output
        industry                                sector
symbol
TSHA    Biotechnology                           Healthcare
GRAMF   Drug Manufacturers—Specialty & Generic  Healthcare
VFC     Apparel Manufacturing                   Consumer Cyclical
ABOS    Biotechnology                           Healthcare
INLX    Software—Application                    Technology
INVO    Medical Devices                         Healthcare
IONM    Medical Care Facilities                 Healthcare
IONQ    Computer Hardware                       Technology
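One likely cause of the shuffling is the isinstance(x, dict) filter: if a ticker has no summaryProfile, its row is dropped from that frame only, and the positional pd.concat(..., axis=1) then pairs every later ticker with the wrong profile. A minimal sketch of an index-aligned alternative follows. It assumes the result of get_modules is a dict keyed by symbol whose values are dicts of modules; the raw dict, its values, the placeholder ticker ZZZZ and the helper flatten_modules are all made up for illustration and are not part of yahooquery.

```python
import pandas as pd

# Hypothetical stand-in for the dict returned by tickers.get_modules(...)
raw = {
    "VFC": {
        "summaryProfile": {"sector": "Consumer Cyclical", "industry": "Apparel Manufacturing"},
        "quoteType": {"symbol": "VFC", "shortName": "sample name 1"},
    },
    # Placeholder ticker with no summaryProfile module
    "ZZZZ": {"quoteType": {"symbol": "ZZZZ", "shortName": "sample name 2"}},
    "ADI": {
        "summaryProfile": {"sector": "Technology", "industry": "Semiconductors"},
        "quoteType": {"symbol": "ADI", "shortName": "sample name 3"},
    },
}

def flatten_modules(raw, modules):
    """Build one frame per module, indexed by ticker, and join them on that index."""
    frames = []
    for module in modules:
        rows = {
            sym: mods[module]
            for sym, mods in raw.items()
            if isinstance(mods, dict) and isinstance(mods.get(module), dict)
        }
        frames.append(pd.DataFrame.from_dict(rows, orient="index"))
    # concat aligns on the ticker index; reindex restores the input order
    return pd.concat(frames, axis=1).reindex(list(raw))

dfsi = flatten_modules(raw, ["summaryProfile", "quoteType"])
print(dfsi[["shortName", "sector", "industry"]])
```

Because each frame keeps the ticker as its index, a ticker with a missing module gets NaN in those columns instead of shifting the rows below it.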

'builtin_function_or_method' object is not iterable

I have the following function to calculate the simple moving average for my data frame.
# data from yfinance
import pandas as pd
import yfinance as yf
symbols = ['SPY', 'TSLA', 'AAPL', 'CAKE', 'JBLU', 'MSFT']
data = yf.download(symbols, start="2015-01-01", end="2021-04-20")
def trend(df, fast, slow):
    Dataset = pd.DataFrame(index=df.index)
    Dataset['SlowSMA'] = df['Close'].rolling(slow).mean()
    Dataset['FastSMA'] = df['Close'].rolling(fast).mean()
    return Dataset
This works fine with a single stock, because there is only one Close column. With several tickers, however, it fails with the following error:
'builtin_function_or_method' object is not iterable
I want to iterate over the data for each ticker in the data frame and save the result in Dataset under the ticker's name, so I know which one is which.
I've modified your code a bit, but this works:
import pandas as pd
import yfinance as yf
symbols = ['SPY', 'TSLA', 'AAPL', 'CAKE', 'JBLU', 'MSFT']
data = yf.download(symbols, start="2015-01-01", end="2021-04-20")
def stock_trend(df, fast, slow):
    # df is the price Series of a single ticker
    Dataset = pd.DataFrame(index=df.index)
    Dataset['SlowSMA'] = df.rolling(slow).mean().values
    Dataset['FastSMA'] = df.rolling(fast).mean().values
    return Dataset
def all_stock_trends(df, fast, slow, stocks: list, value_type: str):
    allst = []
    for stock in stocks:
        data = df[value_type][stock]
        st = stock_trend(data, fast, slow)
        allst.append(st)
    all_st = pd.concat(allst, axis=1)
    all_st.columns = pd.MultiIndex.from_product([stocks, ['SlowSMA', 'FastSMA']])
    return all_st
Then you can do:
all_stocks_close = all_stock_trends(data, '1d', '1d', symbols, 'Close')
Which gives the following output. Both windows are '1d' here, so each average equals that day's close; pass two different windows to get distinct fast and slow averages.
SPY TSLA AAPL CAKE JBLU MSFT
SlowSMA FastSMA SlowSMA FastSMA SlowSMA FastSMA SlowSMA FastSMA SlowSMA FastSMA SlowSMA FastSMA
Date
2014-12-31 205.539993 205.539993 44.481998 44.481998 27.594999 27.594999 50.310001 50.310001 15.860000 15.860000 46.450001 46.450001
2015-01-02 205.429993 205.429993 43.862000 43.862000 27.332500 27.332500 50.110001 50.110001 15.790000 15.790000 46.759998 46.759998
2015-01-05 201.720001 201.720001 42.018002 42.018002 26.562500 26.562500 50.200001 50.200001 15.220000 15.220000 46.330002 46.330002
2015-01-06 199.820007 199.820007 42.256001 42.256001 26.565001 26.565001 49.820000 49.820000 14.970000 14.970000 45.650002 45.650002
2015-01-07 202.309998 202.309998 42.189999 42.189999 26.937500 26.937500 51.700001 51.700001 15.100000 15.100000 46.230000 46.230000
... ... ... ... ... ... ... ... ... ... ... ... ...
2021-04-13 412.859985 412.859985 762.320007 762.320007 134.429993 134.429993 57.610001 57.610001 20.770000 20.770000 258.489990 258.489990
2021-04-14 411.450012 411.450012 732.229980 732.229980 132.029999 132.029999 58.200001 58.200001 20.830000 20.830000 255.589996 255.589996
2021-04-15 415.869995 415.869995 738.849976 738.849976 134.500000 134.500000 57.689999 57.689999 20.690001 20.690001 259.500000 259.500000
2021-04-16 417.260010 417.260010 739.780029 739.780029 134.160004 134.160004 57.840000 57.840000 20.299999 20.299999 260.739990 260.739990
2021-04-19 415.209991 415.209991 714.630005 714.630005 134.839996 134.839996 58.200001 58.200001 20.190001 20.190001 258.739990 258.739990
1585 rows × 12 columns
To access the data for one stock, you can then simply do:
SPY_stock = all_stocks_close['SPY']
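The same per-ticker layout can also be produced without a loop, since rolling works column-wise on a frame of closes. The sketch below is one possible variant, not the answer's method. It uses a small hand-made frame with two hypothetical tickers (AAA, BBB) in place of the yfinance download, integer windows of 2 and 3 rows, and a helper name sma_table chosen only for this example.

```python
import pandas as pd

# Hand-made closes for two hypothetical tickers
dates = pd.date_range("2021-01-01", periods=5, freq="D")
close = pd.DataFrame(
    {"AAA": [1.0, 2.0, 3.0, 4.0, 5.0], "BBB": [10.0, 20.0, 30.0, 40.0, 50.0]},
    index=dates,
)

def sma_table(close, fast, slow):
    """Return fast and slow simple moving averages with (ticker, measure) columns."""
    parts = {
        "FastSMA": close.rolling(fast).mean(),
        "SlowSMA": close.rolling(slow).mean(),
    }
    out = pd.concat(parts, axis=1)  # columns are (measure, ticker)
    # Put the ticker on the outer level and group each ticker's columns together
    return out.swaplevel(axis=1).sort_index(axis=1)

trends = sma_table(close, fast=2, slow=3)
print(trends["AAA"])
```

With integer windows the first window - 1 rows of each average are NaN, which is the expected warm-up period of a moving average.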

Converting a dataframe from inside a pandas Dataframe - Alice Blue API

Newbie alert, please bear with me.
The data I have right now, after running the line of code below, is:
Input-
data = pd.DataFrame(alice._AliceBlue__master_contracts_by_symbol)
Output-
Index          NSE
1018GS2026 GS  Instrument(exchange='NSE', token=6833, symbol='1018GS2026 GS', name='GOI LOAN 10.18% 2026', expiry=None, lot_size='1')
1025GS2021 GS  Instrument(exchange='NSE', token=6819, symbol='1025GS2021 GS', name='GOI LOAN 10.25% 2021', expiry=None, lot_size='1')
116GS2020 GS   Instrument(exchange='NSE', token=6814, symbol='116GS2020 GS', name='GOI LOAN 11.60% 2020', expiry=None, lot_size='1')
182D010721 TB  Instrument(exchange='NSE', token=1776, symbol='182D010721 TB', name='GOI TBILL 182D-01/07/21', expiry=None, lot_size='100')
182D020921 TB  Instrument(exchange='NSE', token=2593, symbol='182D020921 TB', name='GOI TBILL 182D-02/09/21', expiry=None, lot_size='100')
I want a dataframe like this from inside the above dataframe:
Index          Exchange  token  symbol         name                     expiry  lot_size
1018GS2026 GS  NSE       6833   1018GS2026 GS  GOI LOAN 10.18% 2026     None    1
1025GS2021 GS  NSE       6819   1025GS2021 GS  GOI LOAN 10.25% 2021     None    1
116GS2020 GS   NSE       6814   116GS2020 GS   GOI LOAN 11.60% 2020     None    1
182D010721 TB  NSE       1776   182D010721 TB  GOI TBILL 182D-01/07/21  None    100
182D020921 TB  NSE       2593   182D020921 TB  GOI TBILL 182D-02/09/21  None    100
Any suggestions? What should I do?
If the master contract dict has the index as its key and an Instrument object as its value, then it should be easy to convert:
rows = []
for val in alice._AliceBlue__master_contracts_by_symbol.values():
    rows.append([val.exchange, val.token, val.symbol, val.name,
                 val.expiry, val.lot_size])
df = pd.DataFrame(rows,
    index=alice._AliceBlue__master_contracts_by_symbol.keys(),
    columns=['exchange', 'token', 'symbol', 'name', 'expiry', 'lot_size']
)
Edit:
If the AliceBlue thing is really an OrderedDict of OrderedDicts, then it is even easier:
df = pd.DataFrame(
    alice._AliceBlue__master_contracts_by_symbol.values(),
    index=alice._AliceBlue__master_contracts_by_symbol.keys()
)
The correct way is below:
df = pd.DataFrame(alice._AliceBlue__master_contracts_by_symbol)
df = pd.json_normalize([x._asdict() for x in df['NSE']]).set_index(df.index)
Thanks everyone for the help.
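For anyone who wants to try the idea without an Alice Blue session, here is a small sketch on stand-in data. It assumes, as the use of _asdict() above suggests, that Instrument is a namedtuple and that the master-contract dict is nested as exchange, then symbol. The Instrument type and the contracts dict below are re-created by hand purely for illustration.

```python
from collections import namedtuple

import pandas as pd

# Hand-made stand-in for the library's Instrument type (assumed to be a namedtuple)
Instrument = namedtuple(
    "Instrument", ["exchange", "token", "symbol", "name", "expiry", "lot_size"]
)

# Stand-in for alice._AliceBlue__master_contracts_by_symbol: {exchange: {symbol: Instrument}}
contracts = {
    "NSE": {
        "1018GS2026 GS": Instrument("NSE", 6833, "1018GS2026 GS", "GOI LOAN 10.18% 2026", None, "1"),
        "182D010721 TB": Instrument("NSE", 1776, "182D010721 TB", "GOI TBILL 182D-01/07/21", None, "100"),
    }
}

nse = contracts["NSE"]
# One dict per instrument becomes one row; the dict keys become the column names
df = pd.DataFrame([inst._asdict() for inst in nse.values()], index=list(nse.keys()))
print(df)
```

Building the rows from _asdict() avoids spelling out the field names by hand, so the columns follow whatever fields the tuple defines.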

A list of tickers to get sector and name

import pandas as pd
import datetime as dt
from pandas_datareader import data as web
import yfinance as yf
yf.pdr_override()
filename=r'C:\Users\User\Desktop\from_python\data_from_python.xlsx'
yeah = pd.read_excel(filename, sheet_name='entry')
stock = []
stock = list(yeah['name'])
stock = [ s.replace('\xa0', '') for s in stock if not pd.isna(s) ]
adj_close=pd.DataFrame([])
high_price=pd.DataFrame([])
low_price=pd.DataFrame([])
volume=pd.DataFrame([])
print(stock)
['^GSPC', 'NQ=F', 'AAU', 'ALB', 'AOS', 'APPS', 'AQB', 'ASPN', 'ATHM', 'AZRE', 'BCYC', 'BGNE', 'CAT', 'CC', 'CLAR', 'CLCT', 'CMBM', 'CMT', 'CRDF', 'CYD', 'DE', 'DKNG', 'EARN', 'EMN', 'FBIO', 'FBRX', 'FCX', 'FLXS', 'FMC', 'FMCI', 'GME', 'GRVY', 'HAIN', 'HBM', 'HIBB', 'IEX', 'IOR', 'KFS', 'MAXR', 'MPX', 'MRTX', 'NSTG', 'NVCR', 'NVO', 'OESX', 'PENN', 'PLL', 'PRTK', 'RDY', 'REGI', 'REKR', 'SBE', 'SQM', 'TCON', 'TCS', 'TGB', 'TPTX', 'TRIL', 'UEC', 'VCEL', 'VOXX', 'WIT', 'WKHS', 'XNCR']
for symbol in stock:
    # start and end are defined elsewhere (not shown in this snippet)
    adj_close[symbol] = web.get_data_yahoo([symbol], start, end)['Adj Close']
I have a list of tickers and have already got the adjusted close prices. How can I get the NAME and SECTOR for these tickers?
For a single ticker, I found on the web that it can be done like this:
sbux = yf.Ticker("SBUX")
tlry = yf.Ticker("TLRY")
print(sbux.info['sector'])
print(tlry.info['sector'])
How can I build a dataframe from this so that I can write the data to Excel, as I do for the adjusted prices?
Thanks a lot!
You can try this answer using a package called yahooquery. Disclaimer: I am the author of the package.
from yahooquery import Ticker
import pandas as pd
symbols = ['^GSPC', 'NQ=F', 'AAU', 'ALB', 'AOS', 'APPS', 'AQB', 'ASPN', 'ATHM', 'AZRE', 'BCYC', 'BGNE', 'CAT', 'CC', 'CLAR', 'CLCT', 'CMBM', 'CMT', 'CRDF', 'CYD', 'DE', 'DKNG', 'EARN', 'EMN', 'FBIO', 'FBRX', 'FCX', 'FLXS', 'FMC', 'FMCI', 'GME', 'GRVY', 'HAIN', 'HBM', 'HIBB', 'IEX', 'IOR', 'KFS', 'MAXR', 'MPX', 'MRTX', 'NSTG', 'NVCR', 'NVO', 'OESX', 'PENN', 'PLL', 'PRTK', 'RDY', 'REGI', 'REKR', 'SBE', 'SQM', 'TCON', 'TCS', 'TGB', 'TPTX', 'TRIL', 'UEC', 'VCEL', 'VOXX', 'WIT', 'WKHS', 'XNCR']
# Create Ticker instance, passing symbols as first argument
# Optional asynchronous argument allows for asynchronous requests
tickers = Ticker(symbols, asynchronous=True)
data = tickers.get_modules("summaryProfile quoteType")
df = pd.DataFrame.from_dict(data).T
# flatten dicts within each column, creating new dataframes
dataframes = [pd.json_normalize([x for x in df[module] if isinstance(x, dict)]) for module in ['summaryProfile', 'quoteType']]
# concat dataframes from previous step
df = pd.concat(dataframes, axis=1)
# View columns
df.columns
Index(['address1', 'address2', 'city', 'state', 'zip', 'country', 'phone',
'fax', 'website', 'industry', 'sector', 'longBusinessSummary',
'fullTimeEmployees', 'companyOfficers', 'maxAge', 'exchange',
'quoteType', 'symbol', 'underlyingSymbol', 'shortName', 'longName',
'firstTradeDateEpochUtc', 'timeZoneFullName', 'timeZoneShortName',
'uuid', 'messageBoardId', 'gmtOffSetMilliseconds', 'maxAge'],
dtype='object')
# Data you're looking for
df[['symbol', 'shortName', 'sector']].head(10)
   symbol     shortName                      sector
0  NQZ20.CME  Nasdaq 100 Dec 20              NaN
1  ALB        Albemarle Corporation          Basic Materials
2  AOS        A.O. Smith Corporation         Industrials
3  ASPN       Aspen Aerogels, Inc.           Industrials
4  AAU        Almaden Minerals, Ltd.         Basic Materials
5  ^GSPC      S&P 500                        NaN
6  ATHM       Autohome Inc.                  Communication Services
7  AQB        AquaBounty Technologies, Inc.  Consumer Defensive
8  APPS       Digital Turbine, Inc.          Technology
9  BCYC       Bicycle Therapeutics plc       Healthcare
This approach retrieves the prices and the sector in the same loop. Some symbols have no sector, so the lookup is wrapped in a try/except that falls back to 'None'.
Each column is keyed by sector and stock, so after the loop the columns are converted to a hierarchical (MultiIndex) header. Finally, the frame is saved as CSV so it can be imported into Excel. I only tried a few of the symbols because the full list is long, so there may be issues with the rest.
import datetime
import pandas as pd
import yfinance as yf
import pandas_datareader.data as web
yf.pdr_override()
start = "2018-01-01"
end = "2019-01-01"
# symbol = ['^GSPC', 'NQ=F', 'AAU', 'ALB', 'AOS', 'APPS', 'AQB', 'ASPN', 'ATHM', 'AZRE', 'BCYC', 'BGNE', 'CAT',
#'CC', 'CLAR', 'CLCT', 'CMBM', 'CMT', 'CRDF', 'CYD', 'DE', 'DKNG', 'EARN', 'EMN', 'FBIO', 'FBRX', 'FCX', 'FLXS',
#'FMC', 'FMCI', 'GME', 'GRVY', 'HAIN', 'HBM', 'HIBB', 'IEX', 'IOR', 'KFS', 'MAXR', 'MPX', 'MRTX', 'NSTG', 'NVCR',
#'NVO', 'OESX', 'PENN', 'PLL', 'PRTK', 'RDY', 'REGI', 'REKR', 'SBE', 'SQM', 'TCON', 'TCS', 'TGB', 'TPTX', 'TRIL',
#'UEC', 'VCEL', 'VOXX', 'WIT', 'WKHS', 'XNCR']
stock = ['^GSPC', 'NQ=F', 'AAU', 'ALB', 'AOS', 'APPS']
adj_close = pd.DataFrame([])
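for symbol in stock:
    try:
        info = yf.Ticker(symbol).info  # fetch the info dict once per symbol
        sector = info['sector']
        name = info['shortName']
    except Exception:
        # some symbols have no sector; fall back to placeholders
        sector = 'None'
        name = 'None'
    # second level is "<symbol>_<name>", matching the output shown below
    adj_close[sector, f'{symbol}_{name}'] = web.get_data_yahoo(symbol, start=start, end=end)['Adj Close']
idx = pd.MultiIndex.from_tuples(adj_close.columns)
adj_close.columns = idx
adj_close.head()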
None Basic Materials Industrials Technology
^GSPC_None NQ=F_None AAU_None ALB_Albemarle Corporation AOS_A.O. Smith Corporation APPS_Digital Turbine, Inc.
2018-01-02 2695.810059 6514.75 1.03 125.321663 58.657742 1.79
2018-01-03 2713.060059 6584.50 1.00 125.569397 59.010468 1.87
2018-01-04 2723.989990 6603.50 0.98 124.073502 59.286930 1.86
2018-01-05 2743.149902 6667.75 1.00 125.502716 60.049587 1.96
2018-01-08 2747.709961 6688.00 0.95 130.962250 60.335583 1.96
# for excel
adj_close.to_csv('stock.csv', sep=',')
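The hierarchical header can also be built in one step once the prices and the sector lookup are both available. The sketch below shows only that step, on hand-made data: the prices dict, the sectors dict and the sample values stand in for the yfinance results and are assumptions for illustration.

```python
import pandas as pd

# Hand-made stand-ins for the downloaded prices and the sector lookup
dates = pd.date_range("2018-01-02", periods=3, freq="D")
prices = {
    "ALB": pd.Series([125.3, 125.6, 124.1], index=dates),
    "APPS": pd.Series([1.79, 1.87, 1.86], index=dates),
}
sectors = {"ALB": "Basic Materials"}  # APPS deliberately has no sector entry

adj_close = pd.DataFrame(prices)
# Pair each symbol with its sector, using 'None' when the sector is unknown
adj_close.columns = pd.MultiIndex.from_tuples(
    [(sectors.get(sym, "None"), sym) for sym in adj_close.columns],
    names=["sector", "symbol"],
)
print(adj_close)
```

Selecting adj_close["Basic Materials"] then returns every symbol in that sector, and to_csv writes the two header rows that Excel can read back.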
