How to retrieve company Elasticsearch data from Python? - python

I am very new to Elasticsearch and want to analyze data in Python.
I installed the elasticsearch package with pip and tried to import data, but it failed with these error messages:
es = Elasticsearch([{'hosts':'10.251.0.135', 'port':'5601'}])
es.info()
> ConnectionError: ConnectionError(<urllib3.connection.HTTPConnection object at 0x000001AD21943460>: Failed to establish a new connection: [WinError 10061] caused by: NewConnectionError(<urllib3.connection.HTTPConnection object at 0x000001AD21943460>: Failed to establish a new connection: [WinError 10061]
or
es = Elasticsearch("http://10.251.0.134:5601/")
es.info()
> TransportError: TransportError(302, '')
I looked up some solutions, but they all assume that Elasticsearch is running on my local machine, which doesn't help in my case.
I don't think this is an authorization problem, since I can access the data through the web-hosted Kibana app. I'd like to know what the problem could be.

Thanks to leandrojmp, I managed to find the answer.
My situation was:
At work, I needed to retrieve data from an Elasticsearch server into Python.
I was the only analyst; everyone else views the data through Kibana (port 5601).
Neither Elasticsearch nor Kibana is installed on my local machine, so advice such as "change the configuration" didn't apply.
The errors were as stated in the question.
How I figured it out:
I opened port 9200 in the browser, which is direct access to Elasticsearch, and found that I could only reach port 5601, not 9200.
I asked the server manager to disable the firewall, and everything works fine :)
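The browser check described above can also be done from Python. The sketch below uses only the standard library; the helper name looks_like_elasticsearch is hypothetical, and it assumes the cluster answers unauthenticated HTTP requests on its root URL with a JSON document that contains cluster_name and version. A secured cluster would answer with 401, and the probe would then report False.

```python
import http.client
import json
import urllib.request


def looks_like_elasticsearch(base_url, timeout=5.0):
    """Return True if base_url answers like an Elasticsearch root endpoint.

    Assumption: the root URL returns JSON with "cluster_name" and "version".
    """
    # Connect directly and ignore proxy environment variables, since the
    # server in the question is an internal (10.x.x.x) address.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(base_url, timeout=timeout) as resp:
            body = json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # Refused or timed-out connections, HTTP errors, and non-JSON
        # replies (for example an HTML page) all end up here.
        return False
    return isinstance(body, dict) and "cluster_name" in body and "version" in body
```

With the addresses from the question, looks_like_elasticsearch("http://10.251.0.134:9200/") would be expected to return True once port 9200 is reachable, while the same call against port 5601 should return False because Kibana serves its own pages there. After a successful probe, the client can be pointed at the same address, as in Elasticsearch("http://10.251.0.134:9200/").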

Related

Exasol_Error: I keep getting Exasol connection error timed out

I am trying to connect to my Exasol SaaS database. I tried several tools (Talend, DbVisualizer, Power BI) as well as Python, but I cannot connect and keep getting the same error.
I saw another post about this type of error on the Exasol community (https://community.exasol.com/t5/discussion-forum/exaconnectionfailederror/m-p/8049#M1855), but it doesn't explain exactly what was done to fix it. I also tried the ODBC Data Source Administrator (64-bit), with the same error. Maybe it's a connection issue with my PC itself, or maybe I am just entering wrong values; I'm not sure.
The values I entered are the ones recommended in the Exasol docs, and I have removed anything related to a proxy or VPN.
My errors are listed below. I tried from different devices and get the same error. I really don't know what else to try, so any help will be greatly appreciated.
Note: I am using the Exasol SaaS database and am currently in trial mode, so I am not sure whether that is limiting me.
**Errors:**
Error message odbc exasol: [EXASOL][EXASolution driver]connection attempt timed out.
Error message Talend : Connection failure. You must change the Database Settings.
java.lang.RuntimeException: com.exasol.jdbc.ConnectFailed: connect timed out ->
Caused by: com.exasol.jdbc.ConnectFailed: connect timed out
Error message pyexasol : socket.timeout: timed out
Error message dbvisualizer : java.net.SocketTimeoutException: Connect timed out com.exasol.jdbc.ConnectFailed: java.net.SocketTimeoutException: Connect timed out
Error message Power BI desktop : Details: "ODBC: ERROR [HYT00][EXASOL][EXASolution driver]Connection attempt timed out."
My application versions:
DbVisualizer Free 14.0.1 (build: 3540)
Talend Open Studio Data integration(8.0.1.2021119_1610)
java version -> jdk-16.0.02
Power BI -> Version: 2.110.1085.0 64-bit (October 2022)
ODBC : exasolodbc x64 7.1.14
JDBC : exasoljdbc 7.1.14
Python: python 3.8.10 -> pyexasol : 0.25.1
The error means that the client is not able to reach the host for some reason. Try the following:
Make sure the database is still online (they auto-shutdown after 2 hours if there is no activity by default)
Check that the IP address of the machine you are connecting from has been added to the allow list in the SaaS UI. (see the docs)
Check if your host is able to reach the host and port specified in the SaaS UI (for example using telnet on port 8563). Maybe some firewall is preventing access to the database?
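The telnet check in the last point can be approximated from Python. This is a minimal sketch using only the standard library; the function name probe and its return labels are hypothetical, and port 8563 is simply the one mentioned above. It separates a timeout (packets are dropped somewhere) from an active refusal (the host answered, but nothing is listening).

```python
import socket


def probe(host, port, timeout=5.0):
    """Try a plain TCP connection and report how the attempt ended."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all within the timeout
        return "timeout"
    except ConnectionRefusedError:
        # The host replied, but no service accepts connections on this port
        return "refused"
    except socket.gaierror:
        # The host name could not be resolved
        return "unresolved"
    except OSError as exc:
        return f"error: {exc}"
```

Calling probe with the host shown in the SaaS UI and port 8563 and getting "timeout" would be consistent with the errors quoted in the question, where every client reports a connect timeout; that pattern usually suggests that packets are being dropped on the way, for example by a firewall or an allow list.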
So I did more digging. Actually, I still have no idea what the underlying issue was.
Talend:
I made a connection via JDBC in Talend with the help of Exasol support. The Exasol DB type in Talend doesn't work for some reason; it's not known whether the problem is on the Talend side or the Exasol side, and maybe this will be fixed in the future. Just remember to write the JDBC URL as "jdbc:exa:yourconnectionstring"; don't forget the "exa".
Power BI:
I tried the connection string with the fingerprint method, and that worked for me. Just include the fingerprint with the connection string and it should connect.
https://exasol.my.site.com/s/article/PowerBI-Encryption-Fingerprint-Issue-in-Exasol-7-1?language=en_US
DbVisualizer:
My connection string was wrong.
Python:
My connection string was wrong.
Hopefully this helps someone.

GCP dataproc with presto - is there a way to run queries remotely via python using pyhive?

I am trying to run queries on a presto cluster I have running on dataproc - via python (using presto from pyhive) on my local machine. But I can't seem to figure out the host URL. Does GCP dataproc even allow accessing the presto clusters remotely?
I tried using the URL on Presto's web UI, but that didn't work either.
I also checked the docs about using Cloud Client Libraries for Python. Wasn't helpful either. https://cloud.google.com/dataproc/docs/tutorials/python-library-example
from pyhive import presto
query = '''select * FROM system.runtime.nodes'''
presto_conn = presto.Connection(host={host}, port=8060, username ={user})
presto_cursor = presto_conn.cursor()
presto_cursor.execute(query)
Error
ConnectionError: HTTPConnectionPool(host='https', port=80): Max retries exceeded with url: {url}
(Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb41c0c25d0>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))
Update
I was able to manually create a VM on GCP Compute, configure Trino, and set up firewall rules and a load balancer to be able to access the cluster.
I still need to check whether Dataproc allows a similar config.
It looks like the Google firewall is blocking connections from the outside world.
How to fix
Quick and dirty solution
Just allow access to port 8060 on the Dataproc cluster from your IP.
This might not scale if you're on a public IP address, but it will allow you to develop.
It is a bad idea to expose "big data" services to the whole internet. You might get hacked, and Google will shut down the service.
Use an SSH tunnel
Create a small instance (one from the free tier), expose its SSH port to the internet, and use port forwarding.
Your URLs won't be https://dataproc-cluster:8060..., but https://localhost:forwarded_port
This is easy to do, and you can turn off that bastion VM when it's not needed.
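The error in the question reports host='https', port=80, which appears consistent with a full URL (scheme included) being passed as the host argument. The sketch below shows one way to split such an address into a bare host name and port before handing it to pyhive, and how a connection through a forwarded local port might look. The helper names, the default port, and the machine names in the comments are assumptions for illustration only.

```python
from urllib.parse import urlsplit


def split_host_port(address, default_port=8060):
    """Turn 'https://host:8060/ui/' or 'host:8060' into ('host', 8060)."""
    if "://" not in address:
        # urlsplit only fills in hostname/port when a '//' prefix is present
        address = "//" + address
    parts = urlsplit(address)
    if not parts.hostname:
        raise ValueError(f"no host name found in {address!r}")
    return parts.hostname, parts.port or default_port


def connect_through_tunnel(local_port, username):
    """Open a pyhive connection to a locally forwarded port.

    Assumes a tunnel is already running, for example (hypothetical names):
        ssh -L 8060:dataproc-cluster:8060 user@bastion-vm
    """
    from pyhive import presto  # third-party, imported only when needed

    return presto.Connection(host="localhost", port=local_port, username=username)
```

With the tunnel running, connect_through_tunnel(8060, "some_user").cursor() would then be used in the same way as presto_conn.cursor() in the question.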

python salesforce not connecting , home okay, work not. Firewall / Security?

I registered my own salesforce developer login.
I am able to connect to this from my home computer and my work computer via the salesforce login url.
I am now writing python code to extract from salesforce. The code is below.
The code runs on my work laptop when I am at home and connected through my own ISP.
When I run the same code on my work laptop at work (so now using the work ISP), it fails to connect.
The error I get when I run at work is:
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='login.salesforce.com', port=443): Max retries exceeded with url: /services/Soap/u/40.0 (Caused by NewConnectionError(': Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond'))
So I suspect something is going on with a firewall or similar.
But I am confused. It does not seem right that my work laptop is more "open" to the outside when I use my own ISP. I would have thought the security/firewall would be implemented in a layer between the work laptop and the ISP, and so be ISP-agnostic. These laptops are meant to be used at home as well, so I am not doing anything wrong in that respect.
Python code below.
import unicodecsv
from time import sleep
from salesforce_bulk import SalesforceBulk
bulk = SalesforceBulk(password='**', username='**', security_token='**')
job = bulk.create_query_job("Contact", contentType='CSV')
batch = bulk.query(job, "select Id,LastName from Contact")
bulk.close_job(job)
# Poll until Salesforce has finished processing the batch
while not bulk.is_batch_done(batch):
    sleep(10)
for result in bulk.get_all_results_for_query_batch(batch):
    reader = unicodecsv.DictReader(result, encoding='utf-8')
    for row in reader:
        print(row)  # dictionary rows
Oops, figured it out. I needed to add the proxies parameter when connecting to Salesforce at work. It was a bit of a revelation: a level of security/protection that applies on the work network is missing when the work laptop is used at home. I did not realise that networks, firewalls, and security worked this way.
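The post does not show the final code, so the following is only a sketch of what a requests-style proxies mapping looks like and how it can be used to test whether login.salesforce.com is reachable through the corporate proxy. The proxy address and both function names are hypothetical; where exactly the mapping has to be passed in the Salesforce library being used is not shown in the post and is left open here.

```python
def proxies_for(proxy_url):
    """Build a requests-style mapping that sends both schemes through one proxy."""
    return {"http": proxy_url, "https": proxy_url}


def can_reach_salesforce(proxy_url, timeout=10):
    """Return True if login.salesforce.com answers through the given proxy."""
    import requests  # third-party, imported only when needed

    try:
        requests.get(
            "https://login.salesforce.com",
            proxies=proxies_for(proxy_url),
            timeout=timeout,
        )
    except requests.exceptions.RequestException:
        # Covers connection errors, proxy errors, and timeouts
        return False
    return True
```

On the work network, can_reach_salesforce("http://proxy.example.com:8080") (with the real proxy address substituted) succeeding while a direct request times out would confirm that outbound traffic has to go through the proxy.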

Python-Telegram-Bot not running behind firewall

I've been searching for a solution to this for several days, to no avail. I have a Python bot (polling for updates) that runs OK at home or on any public internet connection. However, at work behind a firewall, the bot cannot connect to the server. I believe the application must know the proxy server, user ID, and password in order to proceed, but I cannot find a way to include this information in the bot application. Below is the error message:
2017-03-13 07:13:44,233 - telegram.ext.updater - ERROR - Error
while getting Updates: urllib3 HTTPError HTTPSConnectionPool
(host='api.telegram.org', port=443): Max retries exceeded with
url: /botXXXXXXXXX:Token/getUpdates (Caused by NewConnectionError
('<urllib3.connection.VerifiedHTTPSConnection object at 0x031541F0>:
Failed to establish a new connection: [Errno 10061] No connection
could be made because the target machine actively refused it',))
One other thing: the Telegram messenger application runs OK behind this same firewall without any proxy server information, so it can connect to the server with no problem. I mention this because another cause could be that my company uses Websense or something similar to block the Telegram server, but that is not the case, as the messenger application does work OK.
Thanks a million in advance for any hint.
UPDATE JULY 26th, 2017: The solution was as suggested by Sudheesh: the environment variable https_proxy needs to be set. At the time of that answer, it seems I had entered the wrong proxy server or entered it the wrong way. Looking around the internet, I noticed that the way to set this (on Windows) is:
set https_proxy=http://proxy_url
Notice that the right-hand side of the equals sign uses http (not https).
Thanks to Sudheesh again!
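The same variable can also be set from inside the Python process before the bot starts, which avoids depending on how the shell was configured. The sketch below is a minimal example; the helper names and the proxy host are hypothetical, and percent-encoding the user ID and password is an assumption about what the proxy expects when credentials contain characters such as a backslash or @.

```python
import os
import urllib.request
from urllib.parse import quote


def build_proxy_url(host, port, user=None, password=None):
    """Build an http:// proxy URL, percent-encoding optional credentials."""
    if user is None:
        return f"http://{host}:{port}"
    credentials = quote(user, safe="")
    if password is not None:
        credentials += ":" + quote(password, safe="")
    return f"http://{credentials}@{host}:{port}"


def use_proxy(proxy_url):
    """Set https_proxy for this process and return what urllib now sees."""
    # Note the http:// scheme on the right-hand side, as in the update above
    os.environ["https_proxy"] = proxy_url
    return urllib.request.getproxies_environment().get("https")
```

Calling use_proxy(build_proxy_url("proxy.example.com", 8080)) at the top of the script, before the updater is created, would be the in-process counterpart of the set https_proxy=... command.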

I cannot establish a connection to MonetDB using pymonetDB in Python

I'm trying to use MonetDB in Python to perform some tasks on a huge dataframe. I've installed the suggested API and it has successfully loaded. But when I try to establish the connection, I always get the same error:
[Errno 10061] No connection could be made because the target machine actively refused it
In my research, people have said that I need to set the same port on the client and the server, but I don't know what this means or how to do it. I've tried many variations, but the basic structure I've been using is the one provided in the documentation:
connection = pymonetdb.connect(database = 'Main_Database', username="monetdb", password="monetdb", hostname="localhost")
I'd like to get to the point where my CSV dataframe is fully loaded and I can start doing operations with it.
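Errno 10061 means the connection was actively refused, so nothing accepted it on the address and port the client tried. "Setting the same port" then most likely means passing the port the MonetDB server is listening on to pymonetdb.connect. The sketch below assumes a server running locally on port 50000 (the usual MonetDB default, to my knowledge) and reuses the database name and credentials from the question; the helper names are hypothetical.

```python
def monetdb_settings(database, port=50000, hostname="localhost",
                     username="monetdb", password="monetdb"):
    """Collect connection settings, making the port explicit."""
    return {
        "database": database,
        "hostname": hostname,
        "port": port,  # must match the port the MonetDB server listens on
        "username": username,
        "password": password,
    }


def connect(settings):
    """Open a pymonetdb connection from the settings above."""
    import pymonetdb  # third-party, imported only when needed

    return pymonetdb.connect(**settings)
```

If connect(monetdb_settings("Main_Database")) is still refused, the server is probably not running or is listening on a different port, in which case that port would go into the port argument.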
