Establish Redis connection in Python

I am using the redisbayes library in Python to implement naive Bayes classification. But when I write:

import redis
import redisbayes

rb = redisbayes.RedisBayes(redis=redis.Redis())
rb.train('good', 'sunshine drugs love sex lobster sloth')

it gives the following error:
ConnectionError: Error 10061 connecting localhost:6379.
No connection could be made because the target machine actively refused it.
I tried doing it this way instead:
pool = redis.ConnectionPool(host='localhost', port=6379, db=0)
rb = redisbayes.RedisBayes(redis=redis.Redis(connection_pool=pool))
But it gives the same error. I have not been able to find a solution to this. How can I establish a connection to Redis from Python, or is there any other way to do naive Bayes classification in Python using training data from MySQL?

You do realise you need to have a Redis server running locally to be able to connect to it? Take a look in your process list for redis-server; if it's not there and you don't have a registered service, you might need to install it. Take a look at the installation instructions on the Redis homepage.
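Once redis-server is up, a quick sanity check from Python looks roughly like this (a minimal sketch assuming the default localhost:6379 setup from the question):

import redis
import redisbayes

# Assumes a local redis-server listening on the default port 6379
r = redis.Redis(host='localhost', port=6379, db=0)
r.ping()  # raises redis.exceptions.ConnectionError if the server is unreachable

rb = redisbayes.RedisBayes(redis=r)
rb.train('good', 'sunshine drugs love sex lobster sloth')
print(rb.classify('sloths love sunshine'))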

Related

How to retrieve company Elasticsearch data from Python?

I am very new to Elasticsearch and want to analyze the data in Python.
I installed the elasticsearch pip package and tried to pull data, but it failed with the error messages below.

from elasticsearch import Elasticsearch

es = Elasticsearch([{'host': '10.251.0.135', 'port': '5601'}])
es.info()
> ConnectionError: ConnectionError(<urllib3.connection.HTTPConnection object at 0x000001AD21943460>: Failed to establish a new connection: [WinError 10061] caused by: NewConnectionError(<urllib3.connection.HTTPConnection object at 0x000001AD21943460>: Failed to establish a new connection: [WinError 10061]
or
es = Elasticsearch("http://10.251.0.134:5601/")
es.info()
> TransportError: TransportError(302, '')
I looked up some solutions, but they tend to assume that Elasticsearch is installed on my local machine, which in my case isn't much help.
I don't think it is an authorization problem, as I can access the data through the web-hosted Kibana app. I hope to find out what the problem is.
Thanks to leandrojmp, I managed to find the answer.
My situation was:
At work, I needed to retrieve data from an Elasticsearch server into Python.
I was the only analyst; others view the data through Kibana (port 5601).
No Elasticsearch or Kibana is installed on my local machine, so advice like "change the configuration" didn't apply.
The error was as stated in the question.
How I managed to figure it out:
I went to port 9200 in the browser, which is direct access to the Elasticsearch database, and found out that I only had access to port 5601, not 9200.
I asked the server manager to disable the firewall, and everything works fine :)
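For reference, once port 9200 is reachable, the connection from the question works against the REST port instead of Kibana's. A sketch assuming the poster's host and an elasticsearch-py 7.x-style client:

from elasticsearch import Elasticsearch

# 9200 is the Elasticsearch REST API; 5601 is only Kibana's web UI
es = Elasticsearch([{'host': '10.251.0.135', 'port': 9200}])
print(es.info())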

GCP dataproc with presto - is there a way to run queries remotely via python using pyhive?

I am trying to run queries against a Presto cluster I have running on Dataproc, from Python on my local machine (using presto from pyhive). But I can't seem to figure out the host URL. Does GCP Dataproc even allow accessing Presto clusters remotely?
I tried using the URL from Presto's web UI, but that didn't work either.
I also checked the docs on using the Cloud Client Libraries for Python, which weren't helpful either. https://cloud.google.com/dataproc/docs/tutorials/python-library-example
from pyhive import presto
query = '''select * FROM system.runtime.nodes'''
presto_conn = presto.Connection(host={host}, port=8060, username={user})
presto_cursor = presto_conn.cursor()
presto_cursor.execute(query)
Error
ConnectionError: HTTPConnectionPool(host='https', port=80): Max retries exceeded with url: {url}
(Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb41c0c25d0>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))
Update
I was able to manually create a VM on GCP Compute, configure Trino, and set up firewall rules and a load balancer to be able to access the cluster.
Gotta check if dataproc allows similar config.
Looks like Google firewall is blocking connections from the outside world.
How to fix
Quick and dirty solution
Just allow access to port 8060 from your IP address to the Dataproc cluster.
This might not scale if you're on a public IP address, but it will allow you to develop.
It is a bad idea to expose "big data" services to the whole internet. You might get hacked, and Google will shut down the service.
Use an SSH tunnel
Create a small instance (one from the free tier), expose the SSH port to the internet, and use port forwarding.
Your URLs won't be https://dataproc-cluster:8060..., but https://localhost:forwarded_port
This is easy to do, and you can turn off that bastion VM when it's not needed.
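As a rough sketch of the tunnel approach (the bastion name, zone, coordinator address, and username below are placeholders, and the gcloud command assumes the Cloud SDK is installed):

# On your laptop, forward local port 8060 to the Presto coordinator via the bastion:
#   gcloud compute ssh bastion-vm --zone=us-central1-a -- -N -L 8060:<coordinator-internal-ip>:8060
from pyhive import presto

presto_conn = presto.Connection(host='localhost', port=8060, username='my-user')
presto_cursor = presto_conn.cursor()
presto_cursor.execute('SELECT * FROM system.runtime.nodes')
print(presto_cursor.fetchall())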

Python Salesforce not connecting: home okay, work not. Firewall / security?

I registered my own Salesforce developer login.
I am able to connect to this from my home computer and my work computer via the Salesforce login URL.
I am now writing Python code to extract data from Salesforce. The code is below.
The code runs on my work laptop when I am at home and connected to my ISP.
When running the same code on my work laptop at work, (so now using work ISP), the code fails to connect.
The error I get when I run at work is:
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='login.salesforce.com', port=443): Max retries exceeded with url: /services/Soap/u/40.0 (Caused by NewConnectionError(': Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond'))
So I expect something is going on with the firewall or similar.
But I am confused. It does not seem right that my work laptop is more "open" to the outside when I use my home ISP. I would have thought the security / firewall would be implemented in a layer between the work laptop and the ISP, and so be ISP-agnostic. These laptops are meant to be used at home as well, so I am not doing anything wrong in that respect.
Python code below.
from time import sleep

import unicodecsv
from salesforce_bulk import SalesforceBulk

bulk = SalesforceBulk(password='**', username='**', security_token='**')
job = bulk.create_query_job("Contact", contentType='CSV')
batch = bulk.query(job, "select Id,LastName from Contact")
bulk.close_job(job)

while not bulk.is_batch_done(batch):
    sleep(10)

for result in bulk.get_all_results_for_query_batch(batch):
    reader = unicodecsv.DictReader(result, encoding='utf-8')
    for row in reader:
        print(row)  # dictionary rows
Oops. Figured it out. I need to add a proxies parameter when connecting to Salesforce at work. Bit of a revelation: there is a level of security/protection that is missing on a work laptop when it is used at home. I did not realise that networks / firewalls / security worked that way.
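For anyone in the same situation, the fix looked roughly like this. The proxy URL is a placeholder, and the proxies keyword is taken from the poster's report rather than verified against every salesforce_bulk version:

from salesforce_bulk import SalesforceBulk

# Placeholder corporate proxy; ask your network team for the real host/port
proxies = {
    'http': 'http://proxy.example.corp:8080',
    'https': 'http://proxy.example.corp:8080',
}

# Pass the proxy settings through to the Salesforce connection, as the poster
# describes; check your salesforce_bulk version's signature if this raises TypeError
bulk = SalesforceBulk(username='**', password='**', security_token='**',
                      proxies=proxies)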

I cannot establish a connection to MonetDB using pymonetDB in Python

I'm trying to use MonetDB in Python to perform some tasks on a huge dataframe. I've installed the suggested API and it has successfully loaded. But when I try to establish the connection, I always get the same error:
[Errno 10061] No connection could be made because the target machine actively refused it
In my research, people have said that I need to use the same port on the client and the server, but I don't know what this means or how to do it. I've tried many variations, but the basic structure I've been using is the one provided in the documentation:

import pymonetdb

connection = pymonetdb.connect(database='Main_Database', username="monetdb", password="monetdb", hostname="localhost")
I'd like to get to the point where my CSV dataframe is fully loaded and I can start doing operations with it.
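"Using the same port" just means the port passed to pymonetdb.connect() must match the port the MonetDB server is actually listening on. A minimal sketch, assuming a local server on MonetDB's default port 50000:

import pymonetdb

# The port here must match the one mserver5/monetdbd is listening on (50000 by default)
connection = pymonetdb.connect(database='Main_Database', hostname='localhost',
                               port=50000, username='monetdb', password='monetdb')
cursor = connection.cursor()
cursor.execute('SELECT 1')
print(cursor.fetchall())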

Using Pymongo to connect to MongoDB on AWS instance from Windows

An error is repeatedly being thrown at this line:
from pymongo import MongoClient

client = MongoClient('ec2-12-345-67-89.us-east-2.compute.amazonaws.com', 27017,
                     ssl=True, ssl_keyfile='C:\\mongo.pem')
(Paths and instance name changed for obvious reasons)
Port 27017 for Mongo allows inbound connections in my AWS security group. First I allowed only my IP; now I'm allowing all traffic via that port. I have tried preceding the connection string with "mongodb://" and removing the SSL arguments (I'm fairly certain I don't need them).
The error IntelliJ keeps throwing me is:
pymongo.errors.ConnectionFailure: [WinError 10061] No connection could be made because the target machine actively refused it
It works if I transport the script to the AWS instance and replace the DNS with 'localhost' and remove SSL parameters, but I need this to work remotely.
Three ideas:
1. Ensure "bind_ip" is set to "0.0.0.0" in your mongod.conf and restart mongod, as #ajduke suggests.
2. Make sure mongod is running.
3. Try to connect to the mongod from your client machine using the "mongo" shell to see if it gives you a more informative error.
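A Python equivalent of the check in idea 3, assuming the hostname from the question; the short timeout makes a blocked or refused port fail fast instead of hanging:

from pymongo import MongoClient
from pymongo.errors import ServerSelectionTimeoutError

client = MongoClient('ec2-12-345-67-89.us-east-2.compute.amazonaws.com', 27017,
                     serverSelectionTimeoutMS=5000)
try:
    print(client.admin.command('ping'))  # forces a real round trip to mongod
except ServerSelectionTimeoutError as exc:
    print('Cannot reach mongod:', exc)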
