Beginner question coming up; I can't seem to connect the dots.
I have a portfolio DataFrame called my_pf that includes the tickers I use for collecting the opening price. I succeeded in collecting the opening data via the next two steps.
#create a list from the column 'ticker'
my_tickers = my_pf['ticker'].tolist()
#collect the opening data per ticker
for ticker in my_tickers:
    open_price = yf.Ticker(ticker).info.get('open')
    print(ticker, open_price)
The next step is adding the extracted data to my initial data frame, but how would I go about this?
Thank you for your help in advance.
There are several ways to add rows to a DataFrame, such as df.append() and pd.concat(); this code uses df.append(). Note that df.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so it only runs on older versions. We start with an empty DataFrame that has a ticker column and an opening price column. Once we have the opening price, we append the ticker and its opening price to that DataFrame.
import pandas as pd
import yfinance as yf
# my_tickers = my_pf['ticker'].tolist()
my_tickers = ['msft', 'aapl', 'goog']
tickers = yf.Tickers(my_tickers)
df = pd.DataFrame(index=[], columns=['ticker','Open'])
for ticker in my_tickers:
    open_price = yf.Ticker(ticker).info.get('open')
    df = df.append(pd.Series([ticker,open_price], index=df.columns), ignore_index=True)
print(df)
  ticker     Open
0   msft   204.07
1   aapl   112.37
2   goog  1522.36
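If the goal is to add the prices to the existing my_pf rather than build a second frame, one option is to collect the prices in a list and assign that list as a new column. A minimal sketch, assuming a hypothetical helper get_open_price that stands in for yf.Ticker(ticker).info.get('open'); it is stubbed with the values printed above so the example runs offline:

```python
import pandas as pd

# Stand-in portfolio; in the question this DataFrame already exists as my_pf.
my_pf = pd.DataFrame({'ticker': ['msft', 'aapl', 'goog']})

# Hypothetical stub. To fetch live data, replace the body with:
#     return yf.Ticker(ticker).info.get('open')
# The numbers are the sample values printed in the answer above.
_SAMPLE_OPEN = {'msft': 204.07, 'aapl': 112.37, 'goog': 1522.36}

def get_open_price(ticker):
    return _SAMPLE_OPEN.get(ticker)

# Build the whole column at once; the order follows my_pf['ticker'].
my_pf['Open'] = [get_open_price(t) for t in my_pf['ticker']]
print(my_pf)
```

Building the list first and assigning it once keeps the prices in the same row order as the tickers and avoids growing a DataFrame inside the loop.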
I'm building a script to pull data from Yahoo finance on multiple different companies and have run into an issue in my loop:
import pandas as pd
import yfinance as yf
import datetime
import time
companies = ['AAPL', 'MSFT', 'AMZN', 'PYPL']
xlwriter = pd.ExcelWriter('marketcap.xlsx', engine='openpyxl')
company_metrics = {}
for company in companies:
    company_metrics[company] = {}
    company_info = yf.Ticker(company)
    company_metrics[company]['Market Cap'] = company_info.info['marketCap']
    df = pd.DataFrame.from_dict(company_metrics)
    df.to_excel(xlwriter, sheet_name=company, index=False)
xlwriter.save()
The data pulls fine, but as the loop continues, each new company's data gets added to the previous ones, so the last sheet contains the new data plus all prior data. I would like each sheet to include only its own company's data.
Thank you for your time to read my post!
Try changing the write line to:
df[company].to_excel(xlwriter, sheet_name=company, index=False)
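The reason this helps: company_metrics keeps growing across iterations, so df is rebuilt with one column per company seen so far, and df[company] picks out only the current company's column. A small offline sketch of that selection, using made-up market cap numbers in place of the yfinance call and a plain dict (sheets, a hypothetical name) in place of the Excel writer:

```python
import pandas as pd

# Made-up figures standing in for company_info.info['marketCap'].
fake_market_caps = {'AAPL': 100, 'MSFT': 200, 'AMZN': 300}

company_metrics = {}
sheets = {}  # hypothetical stand-in for the sheets written to Excel

for company, cap in fake_market_caps.items():
    company_metrics[company] = {'Market Cap': cap}
    # df has one column per company processed so far.
    df = pd.DataFrame.from_dict(company_metrics)
    # Double brackets keep a one-column DataFrame for this company only.
    sheets[company] = df[[company]]

print(sheets['AMZN'])
```

Here df[[company]] is used instead of df[company] so that each entry stays a DataFrame; either form restricts the output to the current company.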
I'm new to Python and I want to get the historical data of some stocks. I'm trying to use investpy, but it seems that it can only get one stock at a time.
Is this correct?
If so, how can I merge the individual results into one dataframe?
I tried to run something like this, but got only one column (and without the company's name). yfinance doesn't work in my case.
import investpy as inv
stocks = ["WEGE3", "JHSF3"]
dfs = list()
for stock in stocks:
    df = inv.get_stock_historical_data(stock=stock, country="Brazil", from_date="01/01/2020", to_date="01/01/2021")["Close"]
    dfs.append(df)
import investpy as inv
import pandas as pd
stocks = ["WEGE3", "JHSF3"]
dfs = pd.DataFrame()
for stock in stocks:
    df = inv.get_stock_historical_data(stock=stock, country="Brazil", from_date="01/01/2020", to_date="01/01/2021")["Close"]
    dfs = dfs.append(df)
dfs = dfs.T
dfs.columns = stocks
dfs.head()
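DataFrame.append was removed in pandas 2.0, so on current versions the loop above raises an AttributeError. A sketch of the same idea with pd.concat, using small made-up Close series instead of the investpy call so that it runs offline:

```python
import pandas as pd

stocks = ["WEGE3", "JHSF3"]
dates = pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06"])

# Made-up prices; in the real script each Series would come from
# inv.get_stock_historical_data(...)["Close"].
fake_close = {
    "WEGE3": pd.Series([10.0, 10.5, 10.2], index=dates, name="Close"),
    "JHSF3": pd.Series([5.0, 5.1, 5.3], index=dates, name="Close"),
}

# With axis=1, the dict keys become the column names and the rows are
# aligned on the date index, so no transpose or renaming is needed.
dfs = pd.concat({stock: fake_close[stock] for stock in stocks}, axis=1)
print(dfs.head())
```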
I am gathering data for the S&P 500 companies listed in a CSV file. My question is: how would I create one large dataframe that has 500 columns, one per ticker, with all of the prices? The code is currently:
import os
import pandas as pd
import pandas_datareader as web
import datetime as dt
from datetime import date
import numpy as np
def get_data():
    start = dt.datetime(2020, 5, 30)
    end = dt.datetime.now()
    csv_file = pd.read_csv(os.path.expanduser("/Users/benitocano/Downloads/copyOfSandP500.csv"), delimiter = ',')
    tickers = pd.read_csv("/Users/benitocano/Downloads/copyOfSandP500.csv", delimiter=',', names = ['Symbol', 'Name', 'Sector'])
    for i in tickers['Symbol'][:5]:
        df = web.DataReader(i, 'yahoo', start, end)
        df.drop(['High', 'Low', 'Open', 'Close', 'Volume'], axis=1, inplace=True)
get_data()
As the code stands right now, it is just going to create 500 individual dataframes, so I am asking how to combine them into one large dataframe. Thanks!
EDIT:
The CSV file link is:
https://datahub.io/core/s-and-p-500-companies
I have tried adding this to the above code:
for stock in data:
    series = pd.Series(stock['Adj Close'])
    df = pd.DataFrame()
    df[ticker] = series
print(df)
Though the output is only one column like so:
                  ADM
Date
2020-06-01  38.574604
2020-06-02  39.348278
2020-06-03  40.181465
2020-06-04  40.806358
2020-06-05  42.175167
...               ...
2020-11-05  47.910000
2020-11-06  48.270000
2020-11-09  49.290001
2020-11-10  50.150002
2020-11-11  50.090000
Why is it printing only one column, rather than all of them?
The answer depends on the structure of the dataframes that your current code produces. Because the code depends on files on your local drive, we cannot run it, so it is hard to be specific here. In general there are many options; among the most common, I would say, are:
Put the dfs into a list and use pandas.concat(..., axis=1) on that list to concatenate them column by column.
Merge (merge or join) your dfs on the Date column that I assume each df has.
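The merge option can be sketched as follows. This assumes each frame has a Date column and a single price column already renamed to its ticker; the frames and numbers are made up for illustration:

```python
from functools import reduce

import pandas as pd

# Made-up per-ticker frames; each has a 'Date' column and one price column.
adm = pd.DataFrame({'Date': pd.to_datetime(['2020-06-01', '2020-06-02']),
                    'ADM': [38.57, 39.35]})
mmm = pd.DataFrame({'Date': pd.to_datetime(['2020-06-01', '2020-06-02']),
                    'MMM': [150.0, 151.5]})
abt = pd.DataFrame({'Date': pd.to_datetime(['2020-06-02', '2020-06-03']),
                    'ABT': [90.0, 91.0]})

frames = [adm, mmm, abt]

# Merge pairwise on 'Date'; how='outer' keeps dates missing from some frames.
prices = reduce(
    lambda left, right: pd.merge(left, right, on='Date', how='outer'),
    frames,
)
prices = prices.sort_values('Date').set_index('Date')
print(prices)
```

With how='outer', a ticker that has no price on a given date gets NaN there instead of the whole row being dropped.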
I have downloaded data for multiple stocks and stored it in CSV format. Each stock's data has multiple columns (open price, close price, etc.), and I want the close price column from every stock collected into a new data frame, with the stock names as the column headers.
1) I created a function that downloads the data for a list of stock names and stores each stock's data as a CSV file.
2) Then I wrote a for loop that reads each stock's CSV file, picks only the close column, and stores it in another, initially empty data frame, so that I end up with a data frame of close prices whose column headers are the stock names. I was successful in downloading the stock data but failed in part 2.
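import datetime
from concurrent import futures
import pandas as pd
import pandas_datareader as web
stocks = ['MSFT','IBM', 'GM', 'ACN', 'GOOG']
end=datetime.datetime.now().date()
start=end-pd.Timedelta(days=365*5)
# download one stock's history and save it as <ticker>_data.csv
def hist_data(stock):
    stock_df=web.DataReader(stock,'iex',start,end)
    stock_df['Name']=stock
    fileName=stock+'_data.csv'
    stock_df.to_csv(fileName)
with futures.ThreadPoolExecutor(len(stocks)) as executor:
    # the original called an undefined dwnld_data here; hist_data is the function defined above
    result=executor.map(hist_data,stocks)
print('completed')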
#failing in the code below
close_prices = pd.DataFrame()
for i in stocks:
    df = pd.read_csv(i + '_data.csv')
    df1 = df['close']
    close_prices.append(df1)
#so when I try to print close_prices I get blank output
Try the following:
close_prices = pd.DataFrame()
for i in stocks:
    df = pd.read_csv(i + '_data.csv')
    close_prices[i] = df['close']
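The original loop printed nothing because DataFrame.append never modified the frame in place; it returned a new object, which was discarded (and the method was removed entirely in pandas 2.0). Assigning a column, as above, avoids this. Note that df['close'] carries a default 0, 1, 2, ... index, so the columns line up by row position. If the files may cover different dates, one variation is to index by date first. A sketch, assuming each CSV has 'date' and 'close' columns (an assumption about the file layout) and using in-memory text in place of the files:

```python
import io

import pandas as pd

# In-memory stand-ins for MSFT_data.csv and IBM_data.csv; the 'date' and
# 'close' column names are assumptions about the saved files.
fake_files = {
    'MSFT': "date,open,close\n2019-01-02,99.5,101.1\n2019-01-03,100.0,97.4\n",
    'IBM': "date,open,close\n2019-01-03,112.0,113.0\n2019-01-04,114.0,117.5\n",
}

closes = {}
for name, text in fake_files.items():
    df = pd.read_csv(io.StringIO(text), index_col='date', parse_dates=True)
    closes[name] = df['close']

# Building the frame from a dict of Series aligns the rows on the date index;
# dates missing from one file show up as NaN in that column.
close_prices = pd.DataFrame(closes)
print(close_prices)
```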
I am trying to pull data from Yahoo! Finance for analysis and am having trouble when I want to read from a CSV file instead of downloading from Yahoo! every time I run the program.
import pandas_datareader as pdr
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import datetime
def get(tickers, startdate, enddate):
    def data(ticker):
        return pdr.get_data_yahoo(ticker, start = startdate, end = enddate)
    datas = map(data, tickers)
    return(pd.concat(datas, keys = tickers, names = ['Ticker', 'Date']))
tickers = ['AAPL', 'MSFT', 'GOOG']
all_data = get(tickers, datetime.datetime(2006, 10,1), datetime.datetime(2018, 1, 7))
all_data.to_csv('data/alldata.csv')
#Open file
all_data_csv = pd.read_csv('data/alldata.csv', header = 0, index_col = 'Date', parse_dates = True)
daily_close = all_data[['Adj Close']].reset_index().pivot('Date', 'Ticker', 'Adj Close')
I'm having problems with the 'daily_close' section. The above code works as it is using 'all_data' which comes directly from the web. How do I alter the bottom line of code so that the data is being pulled from my csv file? I have tried daily_close = all_data_csv[['Adj Close']].reset_index().pivot('Date', 'Ticker', 'Adj Close') however this results in a KeyError due to 'Ticker'.
The csv data is in the following format, with the first column containing all of the tickers:
Your current code for all_data_csv will not work as it did for all_data. This is a consequence of the fact that all_data contains a MultiIndex with all the information needed to carry out the pivot.
However, in the case of all_data_csv, the only index is Date. So, we'd need to do a little extra in order to get this to work.
First, reset the Date index
Select only the columns you need - ['Date', 'Ticker', 'Adj Close']
Now, pivot on these columns
c = ['Date', 'Ticker', 'Adj Close']
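daily_close = all_data_csv.reset_index('Date')[c].pivot(index='Date', columns='Ticker', values='Adj Close')  # pivot arguments are keyword-only since pandas 2.0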
daily_close.head()
Ticker          AAPL        GOOG       MSFT
Date
2006-10-02  9.586717  199.422943  20.971155
2006-10-03  9.486828  200.714539  20.978823
2006-10-04  9.653308  206.506866  21.415722
2006-10-05  9.582876  204.574448  21.400393
2006-10-06  9.504756  208.891357  21.362070
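An alternative sketch that avoids the extra reset: read the CSV back with both 'Ticker' and 'Date' as index columns, so the MultiIndex is restored, and then unstack the ticker level. The CSV text below is a small made-up sample, trimmed to one price column, in the long layout that all_data.to_csv writes; it stands in for data/alldata.csv:

```python
import io

import pandas as pd

# Made-up sample in long format: one row per (Ticker, Date) pair.
csv_text = """Ticker,Date,Adj Close
AAPL,2006-10-02,9.59
AAPL,2006-10-03,9.49
MSFT,2006-10-02,20.97
MSFT,2006-10-03,20.98
"""

# Passing two names to index_col rebuilds the (Ticker, Date) MultiIndex.
all_data_csv = pd.read_csv(io.StringIO(csv_text),
                           index_col=['Ticker', 'Date'],
                           parse_dates=['Date'])

# Move the 'Ticker' level into the columns: one column per ticker.
daily_close = all_data_csv['Adj Close'].unstack('Ticker')
print(daily_close)
```

Because the frame read from disk then has the same index structure as all_data, the rest of the analysis code can treat both the same way.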