I got an interesting error that could very easily be user error.
I'm trying to access an API used by the Washington Public Disclosure Commission. It uses Socrata and because I'm using Python I'm using the sodapy package.
The error first:
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='https', port=443): Max retries exceeded with url: //data.wa.gov/resource/dgis-xpmb.json?%24limit=2000 (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7f9da657d110>: Failed to establish a new connection: [Errno -2] Name or service not known',))
Here's the code, which is almost a carbon copy of the code snippet listed in the docs.
import pandas as pd
from sodapy import Socrata
MyAppToken = "###############"
client = Socrata("https://data.wa.gov",
MyAppToken,
username="###############.###",
password="##############")
results = client.get("dgis-xpmb", limit=2000)
results_df = pd.DataFrame.from_records(results)
In my own experimentation I've taken off the bottom line but the error stays the same, because the error gets caught at the client.get command.
I know it's not the server's problem because using curl -i, Chrome and Postman can all get the pertinent data no problem.
The token I'm using is also registered with Socrata so that should feasibly be OK.
I've tried running the python in Python 2.7.6 and 3.4.3 with the required packages. Still no go.
Any help appreciated.
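One thing the error message itself points at: it reports host='https' and a URL starting with //data.wa.gov, which suggests the client adds its own scheme in front of whatever string it is given. Below is a minimal sketch, assuming sodapy expects a bare domain such as data.wa.gov rather than a full URL. The helper name bare_domain is made up for illustration and is not part of sodapy.

```python
from urllib.parse import urlparse


def bare_domain(value):
    """Return only the host part of a URL or domain string (hypothetical helper)."""
    # urlparse only fills in netloc when "//" precedes the host,
    # so prepend it for strings that are already bare domains.
    parsed = urlparse(value if "://" in value else "//" + value)
    return parsed.netloc


# Both spellings reduce to the same host name.
print(bare_domain("https://data.wa.gov"))
print(bare_domain("data.wa.gov"))
```

With that, the client could be created as Socrata(bare_domain("https://data.wa.gov"), MyAppToken, ...), or simply with the literal "data.wa.gov".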
I am using the requests module of Python to download quite a lot of data, but I am facing an issue. Before posting my specific case, I tried a simpler piece of code to test the issue, and I face the same problem. Basically, when running this:
import requests
x = requests.get('https://w3schools.com')
print(x.status_code)
I get the following error:
ConnectionError: HTTPSConnectionPool(host='w3schools.com', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x000002240C195760>: Failed to establish a new connection: [Errno 11002] getaddrinfo failed'))
Could anyone guide me on this? I suspect it's an issue with my network configuration. I have no experience with network management, so I am lost on that.
Thanks
Well, I just found out it is indeed an error with my network, basically missing the proxy info as I am working behind one.
For those looking at this in the future, add your proxy settings manually:
import requests
proxies = {'http': 'http://your.proxy.com:8080',
'https': 'http://your.proxy.com:8080'}
url = 'https://w3schools.com'
response = requests.get(url, proxies=proxies)
print(response.status_code)
If you get 200 then it is working.
I'm trying to get the cluster configs using the databricks Clusters API when I run a notebook in a "jobs workflow" on Azure Databricks.
Can anyone advise on the best approach for this please?
The approaches recommended here:
Azure Databricks python command to show current cluster config
How to call Cluster API and start cluster from within Databricks Notebook?
...work fine when running notebooks "locally" (outside of jobs/workflows), but do not work during jobs/workflows.
The first issue that I noticed when running a notebook in jobs/workflows was that "browserHostName" does not exist in the "tags". Instead there is "hostName" (EDIT - do not use "hostName" to resolve this issue, see below):
import json
import requests
# Notebook context as JSON (dbutils is only available inside a Databricks notebook)
jsonNotebookConfigs = dbutils.notebook.entry_point.getDbutils().notebook().getContext().toJson()
dictNotebookConfigs = json.loads(jsonNotebookConfigs)
try:
    strHostName = dictNotebookConfigs['tags']['browserHostName']
except KeyError:
    # In jobs/workflows the "browserHostName" tag is missing
    strHostName = dictNotebookConfigs['tags']['hostName']
strHostToken = dictNotebookConfigs['extraContext']['api_token']
strClusterId = dictNotebookConfigs['tags']['clusterId']
strUrl = f'https://{strHostName}/api/2.0/clusters/get?cluster_id={strClusterId}'
dictHeader = {'Authorization': f'Bearer {strHostToken}'}
response = requests.get(strUrl, headers=dictHeader)
dictResponse = response.json()
The second issue is that when running the code above, I get the following error(s):
ConnectionRefusedError: [Errno 111] Connection refused
...followed by...
ConnectionError: HTTPSConnectionPool(host='xxx', port=443): Max retries exceeded with url: /api/2.0/clusters/get?cluster_id=xxx (Caused by
...followed by...
NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fa32582d970>: Failed to establish a new connection: [Errno 111] Connection refused'))
...followed by...
MaxRetryError: HTTPSConnectionPool(host='xxx', port=443): Max retries exceeded with url: /api/2.0/clusters/get?cluster_id=xxx (Caused by
...etc...
Any help to resolve this would be much appreciated, thanks
Edit 2022-10-17
I have found a resolution (the error derived from the first issue that I identified):
The API requires that strHostName should be the workspace instance, i.e.
strHostName = 'adb-{WorkspaceId}.{RandomNumber}.azuredatabricks.net'
It turns out that "hostName" does not actually return the workspace instance during jobs/workflows (which causes the second issue).
Instead, you need to use "orgId" which maps to the WorkspaceId, and hard-code the RandomNumber (I do not see any alternative to this). This will fix the issues.
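A minimal sketch of that fix, assuming the notebook context has already been parsed into a dict as in the code above. The function name workspace_host and the sample values are made up for illustration. The random number still has to be copied by hand from your own workspace URL.

```python
def workspace_host(notebook_context, random_number):
    """Build the workspace instance name from the "orgId" tag (hypothetical helper)."""
    # "orgId" maps to the WorkspaceId part of the host name.
    org_id = notebook_context["tags"]["orgId"]
    # random_number is hard-coded by the caller; it is not read from the context.
    return f"adb-{org_id}.{random_number}.azuredatabricks.net"


# Sample context with a placeholder orgId, not a real workspace.
sample_context = {"tags": {"orgId": "1234567890123456", "clusterId": "xxx"}}
print(workspace_host(sample_context, 7))
```

In the notebook, strHostName = workspace_host(dictNotebookConfigs, <your number>) would then replace the try/except lookup.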
I have a program and I want to show the current weather information in the corner. The code works on my own computer, but on my work notebook I can't establish a connection and I get this error:
requests.exceptions.ProxyError:
HTTPSConnectionPool(host='api.openweathermap.org', port=443): Max
retries exceeded with url:
/data/2.5/weather%5Bhttps://api.openweathermap.org/data/2.5/weather%5D?APPID=abc123
d88&q=Frankfurt&units=metric (Caused by ProxyError('Cannot connect to
proxy.', NewConnectionError('<urllib3.connection.HTTPSConnection
object at 0x000002563074FB48>: Failed to establish a new connection:
[WinError 10060]
I found this as a solution and it sounds promising:
Using an HTTP Proxy
Unfortunately, I don't understand how to implement it because I don't know the company's internet settings. Something similar to this:
response = requests.get(url, params=params, proxies={"http": "12.34.56.78:1234", "https": "12.34.56.78:1234"})
What I do know is that I need to set this in PyCharm's terminal to use pip install, and it seems related:
set HTTP_PROXY=12.34.56.78:1234
set HTTPS_PROXY=12.34.56.78:1234
I don't understand much about network settings, but with that it works. Would I have to do this for the program as well? The port in the error message (443) does not match the port I enter above (1234).
Can you help me here? It would be much appreciated! :-)
I solved it:
I defined this globally:
proxies = {'http':'http://12.34.45.67:1234','https':'http://12.34.45.67:1234'}
And made my request like this:
response = requests.get(url, params=params, proxies=proxies)
I also had some typos in the code before, which didn't help either ...
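On the port question: 443 in the error is the port of the API server being contacted, while 1234 is the port of the proxy, so the two do not need to match. If the proxy is already set through HTTP_PROXY / HTTPS_PROXY for pip, the same values can be reused instead of being typed in again. A minimal sketch, assuming the variables hold host:port values like the ones above. The helper name proxies_from_env is made up for illustration.

```python
import os


def proxies_from_env(environ=os.environ):
    """Build a requests-style proxies dict from HTTP_PROXY / HTTPS_PROXY (hypothetical helper)."""
    proxies = {}
    for scheme in ("http", "https"):
        # Accept both the upper-case and lower-case spelling of the variable.
        value = environ.get(scheme.upper() + "_PROXY") or environ.get(scheme + "_proxy")
        if not value:
            continue
        # A value such as "12.34.56.78:1234" has no scheme; assume a plain HTTP proxy.
        if "://" not in value:
            value = "http://" + value
        proxies[scheme] = value
    return proxies


# Placeholder values matching the ones used in the question.
example_env = {"HTTP_PROXY": "12.34.56.78:1234", "HTTPS_PROXY": "12.34.56.78:1234"}
print(proxies_from_env(example_env))
```

The result can be passed straight to requests.get(url, params=params, proxies=proxies_from_env()).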
I am retrieving Crossref metadata using the Crossref REST API.
I have a CSV file of DOIs. I read each DOI with Python and make one API call per DOI to retrieve its metadata from Crossref. I have to fetch metadata for many DOIs, but after retrieving some of it I get a connection error.
import requests
response = requests.get("https://api.crossref.org/v1/works/http://dx.doi.org/" + CitedDOI[X])
This is the connection error:
HTTPSConnectionPool(host='api.crossref.org', port=443): Max retries exceeded with url: /v1/works/http://dx.doi.org/10.1080/10426910802104344 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x0000019510E068D0>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond'))
Try
import requests
import json
response = requests.get("https://api.crossref.org/works/YOUR_DOI")
print(response.json())
I also recommend looking at the Crossref API documentation. The rules change every now and then, but right now you can get faster speeds by:
Including your email address in a header
Requesting multiple DOIs in a single call
Multithreading requests (limited to 50 requests per-second at the time of writing)
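The first two tips can be sketched roughly as below. This is only a sketch under assumptions: the helper names and the tool name in the User-Agent are made up, and the idea that several doi: filters can be combined in one call to the works endpoint should be checked against the current Crossref documentation before relying on it.

```python
def polite_headers(email):
    """Headers carrying a contact address (hypothetical helper)."""
    # The tool name and version are placeholders; only the mailto part matters here.
    return {"User-Agent": f"doi-metadata-script/0.1 (mailto:{email})"}


def batch_params(dois):
    """Query parameters asking for several DOIs in one call (assumed filter syntax)."""
    return {
        # Assumption: repeated doi: filters are combined, one entry per DOI.
        "filter": ",".join(f"doi:{doi}" for doi in dois),
        # Ask for as many rows as DOIs so none are cut off by the default page size.
        "rows": len(dois),
    }


# The first DOI is taken from the error message above; the second is a placeholder.
chunk = ["10.1080/10426910802104344", "10.1000/example"]
print(polite_headers("you@example.com"))
print(batch_params(chunk))
```

A call would then look like requests.get("https://api.crossref.org/works", params=batch_params(chunk), headers=polite_headers("you@example.com")).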
I am performing a REST POST request using Python code. I am trying to communicate with an external server. I am performing several REST calls, but a specific one fails. That specific request is the following:
r = requests.post(url, json=item[1], headers=headers)
I am getting the following error:
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='....euapi', port= ...):
Max retries exceeded with url: ...(Caused by NewConnectionError
('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at: ...
Failed to establish a new connection: [Errno 11004] getaddrinfo failed',))
It seems that the server is blocking my connection because of the number of connections to it. However, I perform only one call. Does that make sense? Any idea how I can overcome this issue?
It seems that your 'url' value is not correct (the address cannot be resolved). Please check it carefully.
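One way to check this before making the request is to pull the host name out of the URL and ask the system resolver about it directly. A minimal sketch using only the standard library; the helper names are made up for illustration, and port 443 is assumed because the request goes over HTTPS.

```python
import socket
from urllib.parse import urlparse


def hostname_of(url):
    """Return the host name part of a URL, or None if there is none."""
    return urlparse(url).hostname


def resolves(host):
    """Return True if the system resolver can look up the host name."""
    if not host:
        return False
    try:
        socket.getaddrinfo(host, 443)
        return True
    except OSError:
        # socket.gaierror ("getaddrinfo failed") is a subclass of OSError.
        return False
```

If resolves(hostname_of(url)) is False, the problem lies in the URL or in DNS/proxy settings rather than in the number of connections.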