I'm trying to import a file using h2o in Python.
h2o.init() is successful, but when I do the following:
df = h2o.import_file(path = "Combined Database - Final.csv")
I get a number of errors that I can't find any help on. Here is the last one that shows up:
H2OConnectionError: Unexpected HTTP error:
HTTPConnectionPool(host='127.0.0.1', port=54321): Max retries exceeded
with url: /3/Jobs/
$03017f00000132d4ffffffff$_a6edaa906ba7a556a417c13149c940db (Caused by
NewConnectionError(': Failed to establish a new connection: [WinError
10048] Only one usage of each socket address (protocol/network
address/port) is normally permitted',))
Above it, there are “OSError”, “NewConnectionError”, “MaxRetryError”.
This is my first time using h2o, and I can't even import my data. Any help would be much appreciated!
Please see the user guide: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/starting-h2o.html
Please also run the following tests (reposted from here) to debug your issue.
Does running h2o.jar from the command line work?
And if so, does h2o.init() then connect to it?
What do the logs say?
Disable your firewall, and see if it makes a difference. (Remember to
re-enable it afterwards).
Try a different port number (the default is 54321).
Shut down h2o (h2o.shutdown()) and try running h2o.init() again to see if it works; a sketch of these last two steps is shown below.
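For illustration, a minimal sketch of those last two steps, assuming a local cluster and reusing the CSV path from the question:

import h2o

# Stop any half-started local cluster, if one is reachable at all
try:
    h2o.shutdown()
except Exception:
    pass

# Retry on a non-default port, then re-attempt the import
h2o.init(port=54322)
df = h2o.import_file(path="Combined Database - Final.csv")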
Related
I am trying to run a GET request for a GitHub URL. Unfortunately, I always get an error message.
I tried it for several different websites and it works, just not for GitHub.
I am trying to do it with Jupyter Notebooks in Python, if that is important.
Here is the error message:
ConnectionError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /jana-hoh/gdp/main/DP_LIVE_22102021141534889.csv (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f7a1c285d60>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))
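For context, a minimal sketch of the kind of request that produces this error (the URL is taken from the error message above):

import requests

url = "https://raw.githubusercontent.com/jana-hoh/gdp/main/DP_LIVE_22102021141534889.csv"
response = requests.get(url, timeout=10)   # raises ConnectionError when the name cannot be resolved
response.raise_for_status()
print(response.text[:200])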
An error message that includes Temporary failure in name resolution indicates that the system's DNS resolver is unable to convert domain names into their corresponding IP addresses. Some of the causes are:
Your DNS configuration is correct, but the server is unable to respond to DNS requests at the moment
Firewall rules
No internet connectivity
Most of the time I've encountered this error, it stemmed from being disconnected from the internet. However, if your internet is working properly, you can try to add another DNS server in /etc/resolv.conf. For example, you can add Cloudflare's:
nameserver 1.1.1.1
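You can also check whether name resolution itself is the problem before changing anything; a quick sketch in Python:

import socket

try:
    print(socket.gethostbyname("raw.githubusercontent.com"))
except socket.gaierror as exc:
    print("DNS lookup failed:", exc)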
I registered my own Salesforce developer login.
I am able to connect to this from my home computer and my work computer via the Salesforce login URL.
I am now writing Python code to extract data from Salesforce. The code is below.
The code runs on my work laptop when I am at home and connected to my ISP.
When running the same code on my work laptop at work (so now using the work ISP), the code fails to connect.
The error I get when I run at work is:
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='login.salesforce.com', port=443): Max retries exceeded with url: /services/Soap/u/40.0 (Caused by NewConnectionError(': Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond'))
So I expect something is going on with a firewall or similar.
But I am confused. It does not seem right that my work laptop is more "open" to the outside when I use my home ISP. I would have thought the security / firewall would be implemented in a layer between the work laptop and the ISP, and so be ISP-agnostic. These laptops are meant to be used at home as well, so I am not doing anything wrong in that respect.
Python code below.
from time import sleep

import unicodecsv
from salesforce_bulk import SalesforceBulk

# Log in and create a bulk query job against the Contact object
bulk = SalesforceBulk(password='**', username='**', security_token='**')
job = bulk.create_query_job("Contact", contentType='CSV')
batch = bulk.query(job, "select Id,LastName from Contact")
bulk.close_job(job)

# Wait for the batch to finish, then stream the CSV results
while not bulk.is_batch_done(batch):
    sleep(10)

for result in bulk.get_all_results_for_query_batch(batch):
    reader = unicodecsv.DictReader(result, encoding='utf-8')
    for row in reader:
        print(row)  # dictionary rows
Oops. Figured it out. I needed to add a proxies parameter when connecting to Salesforce at work. Bit of a revelation: there is a level of security/protection that is missing on a work laptop when it is used at home. I did not realise that networks / firewalls / security worked in that way.
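For reference, one way to route the underlying HTTP calls through a corporate proxy without relying on a library-specific keyword is the standard proxy environment variables, which requests honours; the proxy address below is a placeholder:

import os

# Placeholder proxy address; set these before creating SalesforceBulk
os.environ["HTTP_PROXY"] = "http://proxy.example.com:8080"
os.environ["HTTPS_PROXY"] = "http://proxy.example.com:8080"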
I have a Django project, currently hosted on PythonAnywhere, which uses the Finding API of the open-source project ebaysdk-python. On my local machine the site worked perfectly; however, when I execute the API call I get this error message: HTTPConnectionPool(host='svcs.ebay.com', port=80): Max retries exceeded with url: /services/search/FindingService/v1 (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f6560105150>: Failed to establish a new connection: [Errno 111] Connection refused',)).
Now I have scoured the docs and other related questions, but could not figure out the issue. I have verified that my API keys are correct, and my code to execute the API call is straight from the docs. So that I may be pointed in the correct direction: what is the most likely cause for this error to be raised under these circumstances?
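For context, a minimal sketch of the kind of Finding API call described above (the app id and keywords are placeholders, not my actual values):

from ebaysdk.finding import Connection as Finding

api = Finding(appid="YOUR_APP_ID", config_file=None)
response = api.execute('findItemsAdvanced', {'keywords': 'legos'})
print(response.dict())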
Thank you.
I am using the REST API Modular Input within Splunk to GET data.SFGov.org data via the SODA API. I have an APP TOKEN. I am getting the MAX RETRIES EXCEEDED error repeatedly.
Background:
I am building a prototype Splunk-based stream cursor for SF open data. I have been testing a GET using the REST API MODULAR INPUT from the configuration screen itself; I have not written any Python code yet. Here is the ERROR.
11-30-2016 16:24:57.432 -0800 ERROR ExecProcessor - message from "python /Applications/Splunk/etc/apps/rest_ta/bin/rest.py" Exception performing request: HTTPSConnectionPool(host='data.sfgov.org', port=443): Max retries exceeded with url: [REDACTED] (Caused by : [Errno 8] nodename nor servname provided, or not known)
I found out that, by mistake, the REST API module's polling interval was set to 60 seconds, which might have caused a problem (I changed it to ONE DAY to avoid future issues). I then got a new APP TOKEN and tried a GET. I can see the GET going out in the log, but the same MAX RETRIES EXCEEDED error keeps coming. I am using the same IP address.
I will be testing for the next few weeks. How do I fix this and gracefully avoid it again?
#chrismetcalf - just flagging you.
Max Retries Exceeded is not an error message that I'd expect to see out of our API, especially if you were only making a call every 60 seconds. I think that may actually be Splunk giving up after trying and failing to make your HTTP call too many times.
The error message Caused by : [Errno 8] nodename nor servname provided, or not known makes me think that there's actually a DNS error on Splunk's side. That's the error message you usually see when a domain name can't be resolved.
Perhaps there's some DNS whitelisting you need to do in your Splunk environment?
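One quick way to narrow it down is to try the same GET outside Splunk from the same machine; a sketch, assuming a placeholder Socrata dataset id and your app token:

import requests

dataset_id = "xxxx-xxxx"   # placeholder dataset id
headers = {"X-App-Token": "YOUR_APP_TOKEN"}
resp = requests.get(f"https://data.sfgov.org/resource/{dataset_id}.json",
                    headers=headers, params={"$limit": 1}, timeout=10)
print(resp.status_code, resp.text[:200])

If this also fails with a name-resolution error, the problem is on the host itself rather than in Splunk.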
I am having the following error
(the URL and API names in this example are made up):
(, ConnectionError(MaxRetryError("HTTPConnectionPool(host='urlICantDisplay.com', port=80): Max retries exceeded with url: /some_api/user_id/action_name (Caused by : [Errno -2] Name or service not known)",),), )
I use the same API for many users, but suddenly I started getting this error,
and from then on I keep getting it until I restart the process.
I've read this might be a congestion problem:
Random "[Errno -2] Name or service not known" errors
Pausing between calls might help, but this is a real-time application that should not pause.
I would also have presumed that the API would start working again after a while,
but I used the API again in the same process after 7 hours and still got the error.
I also read that this is a DNS error, but as I've said, DNS works and then suddenly stops working altogether.
Only restarting the process solved it.
I thought about caching the resolved IP address so the process stops asking the DNS server each time,
but I'm not sure whether that would work or is even related; a sketch of the idea is below.
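Here is a rough sketch of that caching idea: resolve the hostname once at startup and reuse the address, sending the original host name in the Host header (the host and path are the ones from the error message above):

import socket
import requests

HOST = "urlICantDisplay.com"
IP = socket.gethostbyname(HOST)   # resolve once, up front

resp = requests.get(f"http://{IP}/some_api/user_id/action_name",
                    headers={"Host": HOST}, timeout=10)
print(resp.status_code)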