I am trying to pull data from Yahoo Finance via pandas. I have used similar pulls before without any issues.
import pandas as pd
import numpy as np
import datetime as dt
from dateutil import parser
from pandas_datareader import data
from dateutil.relativedelta import relativedelta
end_date=dt.datetime.today()
begdate = end_date + relativedelta(years=-10)
data1 = data.get_data_yahoo('^DJI',begdate,end_date,interval='m')
This is the error I am getting
RemoteDataError: Unable to read URL: http://ichart.finance.yahoo.com/table.csv
I am using Python 3.5
EDIT:
This issue has been fixed as of v0.5.0 of pandas-datareader. The fix below no longer applies.
As pointed out by others, the API endpoint has changed and a patch has been made but hasn't been merged to the master branch of pandas-datareader yet (as of 2017-05-21 6:19 UTC). The fix is at this branch by Rob Kimball (Issue | PR). For a temporary fix (until the patch is merged into master), try:
$ pip install git+https://github.com/rgkimball/pandas-datareader@fix-yahoo --upgrade
Or, in case you want to tweak the source code:
$ git clone https://github.com/rgkimball/pandas-datareader
$ cd pandas-datareader
$ git checkout fix-yahoo
$ pip install -e .
On Python:
import pandas_datareader as pdr
print(pdr.__version__) # Make sure it is '0.4.1'.
pdr.get_data_yahoo('^DJI')
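Since the official fix landed in v0.5.0, a quick version check shows whether an installation still predates it. The following is a minimal sketch using only the standard library; the helper names (`installed_version`, `version_tuple`, `predates_official_fix`) are made up for illustration, and the version parsing is deliberately simplistic. Note that the patched branch reports 0.4.1, so it is also reported as predating the official release.

```python
import re
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '0.4.1' (or '0.5.0rc1') into a tuple of ints such as (0, 4, 1)."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def predates_official_fix(version):
    """True if the version is older than 0.5.0, the release named in the edit above."""
    return version_tuple(version) < (0, 5, 0)
```

Typical use would be `predates_official_fix(installed_version("pandas-datareader"))`, after checking that `installed_version` did not return `None`.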
Related
Here is the code to reproduce it:
import pandas as pd
url = 'https://info.gesundheitsministerium.gv.at/data/timeline-faelle-bundeslaender.csv'
df = pd.read_csv(url)
it fails with the following traceback:
URLError: <urlopen error [SSL: DH_KEY_TOO_SMALL] dh key too small (_ssl.c:1129)>
Here is a link to check the url. The download works from the same browser, if you embed the link on a markdown cell in Jupyter.
Any ideas to make this "just work"?
Update:
As per the question suggested by Florin C. below, this solution solves the issue when downloading via requests:
import requests
import urllib3
requests.packages.urllib3.util.ssl_.DEFAULT_CIPHERS = 'ALL:@SECLEVEL=1'
requests.get(url)
It would just be a matter of forcing pandas to do the same, somehow.
My Environment:
Python implementation: CPython
Python version : 3.9.7
IPython version : 7.28.0
requests : 2.25.1
seaborn : 0.11.2
json : 2.0.9
numpy : 1.20.3
plotly : 5.4.0
matplotlib: 3.5.0
lightgbm : 3.3.1
pandas : 1.3.4
Watermark: 2.2.0
Here is a solution if you need to automate the process and don't want to have to download the csv first and then read from file.
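import io
import pandas as pd
import requests
import urllib3
url = 'https://info.gesundheitsministerium.gv.at/data/timeline-faelle-bundeslaender.csv'
# Lower the OpenSSL security level so the server's small DH key is accepted
requests.packages.urllib3.util.ssl_.DEFAULT_CIPHERS = 'ALL:@SECLEVEL=1'
res = requests.get(url)
# Parse the downloaded bytes directly; the file is semicolon-separated
df = pd.read_csv(io.BytesIO(res.content), sep=';')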
Note that it may not be safe to lower the default ciphers to SECLEVEL=1 at the OS level, but this temporary, per-process change should be OK.
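If you would rather not change a library-wide default, a similar effect can probably be had by scoping the lowered security level to a single SSL context. This is a sketch using only the standard library (`ssl` and `urllib.request`), not the approach from the answers above; `legacy_ssl_context` and `fetch_bytes` are hypothetical helper names. It assumes the server's DH key is accepted at OpenSSL security level 1, which may not hold for very small keys.

```python
import ssl
import urllib.request


def legacy_ssl_context():
    """Build a context that still verifies certificates but tolerates weaker key exchange."""
    ctx = ssl.create_default_context()
    # OpenSSL cipher-string syntax: '@SECLEVEL=1' lowers the security level
    # for this context only, leaving process-wide defaults untouched.
    ctx.set_ciphers("DEFAULT:@SECLEVEL=1")
    return ctx


def fetch_bytes(url, timeout=30):
    """Download a URL using the relaxed context and return the raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout, context=legacy_ssl_context()) as resp:
        return resp.read()
```

The result can then be handed to pandas in the same way as above, for example `pd.read_csv(io.BytesIO(fetch_bytes(url)), sep=';')`.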
I am trying to pull finance data from online using pandas and pandas-datareader, and a key file in the program seems to be missing.
I have tried creating the file and continuing, but I have no idea what this code is supposed to do, so that did not work. I have tried uninstalling pandas and pandas-datareader, with both conda install and pip install. I have tried re-installing all of Anaconda and Visual Studio Code, to no avail.
import datetime as dt
now = dt.datetime.now()
import matplotlib.pyplot as plt
from matplotlib import style
import pandas as pd
import pandas_datareader.data as web
style.use('ggplot')
start = dt.datetime(2010,1,1)
end = dt.datetime.now
df = web.DataReader('IIPR', 'yahoo', start, end)
print(df.head)
ERROR
(base) C:\Users\Chris\Desktop\Python>cd c:\Users\Chris\Desktop\Python && cmd /C "set "PYTHONIOENCODING=UTF-8" && set "PYTHONUNBUFFERED=1" && C:\Users\Chris\Anaconda3\python.exe c:\Users\Chris.vscode\extensions\ms-python.python-2018.12.1\pythonFiles\ptvsd_launcher.py
--default --client --host localhost --port 55519 c:\Users\Chris\Desktop\Python\FinanceBasics.py "
Backend Qt5Agg is interactive backend. Turning interactive mode on.
Unable to open 'tslib.pyx': File not found (file:///c:/users/chris/desktop/python/pandas/_libs/tslib.pyx).
I expect the code to create a data frame with finance data pulled from yahoo (I also understand that there are some problems with yahoo finance API and others but I have not been able to arrive at the step in my code).
Overview
from elasticsearch import Elasticsearch does not work.
import elasticsearch
e = elasticsearch.Elasticsearch(...)
does work.
Deets
I am trying to use a simple Elasticsearch client in Python on AWS (SSH'd into an Amazon Linux EC2 machine). The code I am copying is here. I am unable to import the Elasticsearch class as described in the guide.
Using from elasticsearch import Elasticsearch gives me the error: ImportError: cannot import name 'Elasticsearch'.
I opened the python3 cli to check it out. If I type from elasticsearch import E and tab-complete, I get the following suggestions: EOFError( Ellipsis EnvironmentError( Exception(. However from elasticsearch import Ellipsis gives me ImportError: cannot import name 'Ellipsis'.
If I type import elasticsearch, then on the next line elasticsearch. and hit tab to autocomplete, I get the full range that I would expect (Elasticsearch(, RequestsHttpConnection(, etc.).
I assume that this has something to do with the way/where it is installed.
I used pip3 install elasticsearch --user to install it originally. I uninstalled it (pip3 uninstall elasticsearch) and returned to the python cli. from elasticsearch import E still gives me EOFError( Ellipsis EnvironmentError( Exception( on the tab-complete, but from elasticsearch import Ellipsis now returns ModuleNotFoundError: No module named 'elasticsearch', as does just import elasticsearch.
Not really quite sure what is up. I did not tag this as elasticsearch because it might be a user error :P
which python3: /usr/bin/python3
which pip3: ~/.local/bin/pip3
pip3 --version: pip 18.1 from /home/ec2-user/.local/lib/python3.6/site-packages/pip (python 3.6)
My problem was that I had named my file the same thing as the module I was trying to import from - elasticsearch.py. As user2357112 states, I became hung up on the incorrect autocomplete.
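One way to spot this kind of shadowing is to ask Python where it would load a module from. The sketch below uses `importlib.util.find_spec` from the standard library; `module_origin` is a hypothetical helper name. If the reported path points at a file in your own working directory (such as `elasticsearch.py`) rather than at `site-packages`, your file is shadowing the installed package.

```python
import importlib.util


def module_origin(name):
    """Return the file path a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin
```

Running `print(module_origin("elasticsearch"))` from the directory that contains your script shows which file wins the import.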
I tried to import an Excel file with the following code. The file is about 71 MB, and when running the code it shows "IndexError: pop from empty stack". Kindly help me with this.
Code:
import pandas as pd
df1 = pd.read_excel('F:/Test PCA/Week-7-MachineLearning/weather.xlsx',
sheetname='Sheet1', header=0)
Data: https://www.dropbox.com/s/zyrry53li55hvha/weather.xlsx?dl=0
Using the latest pandas and xlrd this works fine to read the "weather.xlsx" file you provided:
df1 = pd.read_excel('weather.xlsx',sheet_name='Sheet1')
Can you try running:
pip install --upgrade pandas
pip install --upgrade xlrd
To ensure you have the latest version of the modules for reading the file?
I tried the same code you provided with the versions of pandas and xlrd below, and it works fine. I just changed the sheetname argument to sheet_name:
pandas==0.22.0
xlrd==1.1.0
df=pd.read_excel('weather.xlsx',sheet_name='Sheet1',header=0)
I'm writing a lambda function that works with datetimes and trying to import pytz so I can have timezone be accounted for when comparing.
import boto3
import pytz
from datetime import timedelta, date, datetime
from boto3.dynamodb.conditions import Key, Attr
causes this to display
{errorMessage=Unable to import module 'lambda_function'}
but when I remove import pytz the function fires (it just doesn't work properly without timezone info)
If you don't have access to pytz in your environment, maybe you have access to python-dateutil. In that case you can do:
import datetime
import dateutil.tz
eastern = dateutil.tz.gettz('US/Eastern')
datetime.datetime.now(tz=eastern)
REF. How to get current time in Pacific Timezone when import pytz fails?
You need to install the pytz package so it's available for your lambda. The way you do this is having pip install it into the directory you are going to zip and upload to AWS (i.e. peered with the file containing your lambda function).
pip install -t path/to/your/lambda pytz
Then when you zip it up and upload it, it will be available.
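The zip step can also be scripted. This is a minimal sketch, assuming the directory you ran `pip install -t` into also holds your handler file; `build_deployment_zip` and `top_level_entries` are hypothetical helper names, and the archive should be written outside the source directory so it does not include itself.

```python
import shutil
import zipfile
from pathlib import Path


def build_deployment_zip(source_dir, zip_base):
    """Zip the contents of source_dir so that its files sit at the archive root."""
    archive = shutil.make_archive(str(zip_base), "zip", root_dir=str(source_dir))
    return Path(archive)


def top_level_entries(zip_path):
    """List the distinct top-level names inside the archive, sorted."""
    with zipfile.ZipFile(zip_path) as zf:
        return sorted({name.split("/")[0] for name in zf.namelist()})
```

Both the handler file and the `pytz` directory should show up in `top_level_entries`, which matches the "peered with the file containing your lambda function" layout described above.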
Editing to add that I created a tool to do a lot of this for you - you can find it here: https://github.com/jimjkelly/lambda-deploy
To follow up on @cheframzi's answer to "Package a pytz zip file in the format python/pytz/..." as a Lambda Layer, here is one way to do that.
mkdir python
pip3 install -t python pytz=='2019.2'
zip -r pytz.zip python
rm -rf python
And then you can use aws lambda publish-layer-version --layer-name <layer_name> --zip-file fileb://./pytz.zip to deploy a new version of the layer.
As long as the library is installed at the python/pytz level of the zip file, AWS Lambda should be able to find it. Alternatively, you can put it under python/lib/python3.8/site-packages/pytz, matching your specific Python runtime version, as described here: https://docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html
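Before publishing, it can help to confirm that the archive really has the `python/pytz/...` layout. The following sketch checks this with the standard `zipfile` module; `layer_has_package` is a hypothetical helper name, and it only covers the plain `python/<package>/` layout, not the `site-packages` variant.

```python
import zipfile


def layer_has_package(zip_path, package):
    """True if the archive contains entries under python/<package>/."""
    prefix = f"python/{package}/"
    with zipfile.ZipFile(zip_path) as zf:
        return any(name.startswith(prefix) for name in zf.namelist())
```

For the archive built above, `layer_has_package("pytz.zip", "pytz")` should return `True`.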
I ran into this issue today. The way I solved it:
Packaged a pytz zip file in the format python/pytz/... (the library files)
Created a Lambda Layer from it
Used that layer in my Lambda
I spent a few hours on this pytz issue.
With AWS you can try using the gettz method from 'dateutil.tz'.
This way you can get the desired result that you were getting with pytz.
In my case I required isoformat time (in utc with (+00:00) timezone).
from datetime import datetime
from dateutil.tz import gettz
datetime.now(gettz('UTC')).isoformat() # same result as datetime.now(pytz.utc).isoformat()
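For the UTC-only case, neither pytz nor dateutil is strictly needed, because the standard library ships a fixed UTC timezone object. This is a small sketch of that alternative (not what the answer above uses); `utc_now_iso` is a hypothetical helper name. It does not help with named zones such as 'US/Eastern'.

```python
from datetime import datetime, timezone


def utc_now_iso():
    """Return the current time as an ISO 8601 string with a +00:00 offset."""
    return datetime.now(timezone.utc).isoformat()
```

Because `timezone.utc` is part of `datetime`, this works in a Lambda without any extra packaging.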
You can also add public ARNs as Lambda layers, I used this
https://dev.to/vumdao/create-cron-jobs-on-aws-lambda-with-cloudwatch-event-3e07
import datetime as dt
import dateutil.tz
aesttime = dateutil.tz.gettz('Australia/Brisbane')
print(dt.datetime.now(tz=aesttime).strftime(format="%Y-%m-%d %H:%M:%S"))