I am using GCP with its Cloud Functions to execute web scrapers on a frequent basis. Also locally, my script is working without any problems.
I have a setup.py file in which I am initializing the connection to a Kafka Producer. This looks like this:
p = Producer(
{
"bootstrap.servers": os.environ.get("BOOTSTRAP.SERVERS"),
"security.protocol": os.environ.get("SECURITY.PROTOCOL"),
"sasl.mechanisms": os.environ.get("SASL.MECHANISMS"),
"sasl.username": os.environ.get("SASL.USERNAME"),
"sasl.password": os.environ.get("SASL.PASSWORD"),
"session.timeout.ms": os.environ.get("SESSION.TIMEOUT.MS")
}
)
def delivery_report(err, msg):
"""Called once for each message produced to indicate delivery result.
Triggered by poll() or flush()."""
print("Got here!")
if err is not None:
print("Message delivery failed: {}".format(err))
else:
print("Message delivered to {} [{}]".format(msg.topic(), msg.partition()))
return "DONE."
I am importing this setup in main.py in which my scraping functions are defined. This looks similar to this:
from setup import p, delivery_report
def scraper():
try:
# I won't insert my whole scraper here since it's working fine ...
print(scraped_data_as_dict)
p.produce(topic, json.dumps(scraped_data_as_dict), callback=delivery_report)
p.poll(0)
except Exception as e:
# Do sth else
The point here is: I am printing my scraped data in the console. But it doesn't do anything with the producer. It's not even logging an failed producer message (deliver_report) on the console. It's like my script is ignoring the producer command. Also, there are no Error reports in the LOG of the Cloud Function. What am I doing wrong since the function is doing something, except the important stuff? What do I have to be aware of when connection Kafka with Cloud Functions?
I am trying to create a simple HTTP server that uses the Python HTTPServer which inherits BaseHTTPServer. [https://github.com/python/cpython/blob/main/Lib/http/server.py][1]
There are numerous examples of this approach online and I don't believe I am doing anything unusual.
I am simply importing the class via:
"from http.server import HTTPServer, BaseHTTPRequestHandler"
in my code.
My code overrides the do_GET() method to parse the path variable to determine what page to show.
However, if I start this server and connect to it locally (ex: http://127.0.0.1:50000) the first page loads fine. If I navigate to another page (via my first page links) that too works fine, however, on occasion (and this is somewhat sporadic), there is a delay and the server log shows a Request timed out: timeout('timed out') error. I have tracked this down to the handle_one_request method in the BaseHTTPServer class:
def handle_one_request(self):
"""Handle a single HTTP request.
You normally don't need to override this method; see the class
__doc__ string for information on how to handle specific HTTP
commands such as GET and POST.
"""
try:
self.raw_requestline = self.rfile.readline(65537)
if len(self.raw_requestline) > 65536:
self.requestline = ''
self.request_version = ''
self.command = ''
self.send_error(HTTPStatus.REQUEST_URI_TOO_LONG)
return
if not self.raw_requestline:
self.close_connection = True
return
if not self.parse_request():
# An error code has been sent, just exit
return
mname = 'do_' + self.command ## the name of the method is created
if not hasattr(self, mname): ## checking that we have that method defined
self.send_error(
HTTPStatus.NOT_IMPLEMENTED,
"Unsupported method (%r)" % self.command)
return
method = getattr(self, mname) ## getting that method
method() ## finally calling it
self.wfile.flush() #actually send the response if not already done.
except socket.timeout as e:
# a read or a write timed out. Discard this connection
self.log_error("Request timed out: %r", e)
self.close_connection = True
return
You can see where the exception is thrown in the "except socket.timeout as e:" clause.
I have tried overriding this method by including it in my code but it is not clear what is causing the error so I run into dead ends. I've tried creating very basic HTML pages to see if there was something in the page itself, but even "blank" pages cause the same sporadic issue.
What's odd is that sometimes a page loads instantly, and almost randomly, it will then timeout. Sometimes the same page, sometimes a different page.
I've played with the http.timeout setting, but it makes no difference. I suspect it's some underlying socket issue, but am unable to diagnose it further.
This is on a Mac running Big Sur 11.3.1, with Python version 3.9.4.
Any ideas on what might be causing this timeout, and in particular any suggestions on a resolution. Any pointers would be appreciated.
After further investigation, this particular appears to be an issue with Safari. Running the exact same code and using Firefox does not show the same issue.
I've been working with the example-minimal.py script from https://github.com/toddmedema/echo and need to alter it so that rather than printing the status changes to the terminal, it executes another script.
I'm a rank amateur but eager to learn and even more eager to get this project done.
Thanks in advance for any help you can provide!!
""" fauxmo_minimal.py - Fabricate.IO
This is a demo python file showing what can be done with the debounce_handler.
The handler prints True when you say "Alexa, device on" and False when you say
"Alexa, device off".
If you have two or more Echos, it only handles the one that hears you more clearly.
You can have an Echo per room and not worry about your handlers triggering for
those other rooms.
The IP of the triggering Echo is also passed into the act() function, so you can
do different things based on which Echo triggered the handler.
"""
import fauxmo
import logging
import time
from debounce_handler import debounce_handler
logging.basicConfig(level=logging.DEBUG)
class device_handler(debounce_handler):
"""Publishes the on/off state requested,
and the IP address of the Echo making the request.
"""
TRIGGERS = {"device": 52000}
def act(self, client_address, state, name):
print "State", state, "on ", name, "from client #", client_address
return True
if __name__ == "__main__":
# Startup the fauxmo server
fauxmo.DEBUG = True
p = fauxmo.poller()
u = fauxmo.upnp_broadcast_responder()
u.init_socket()
p.add(u)
# Register the device callback as a fauxmo handler
d = device_handler()
for trig, port in d.TRIGGERS.items():
fauxmo.fauxmo(trig, u, p, None, port, d)
# Loop and poll for incoming Echo requests
logging.debug("Entering fauxmo polling loop")
while True:
try:
# Allow time for a ctrl-c to stop the process
p.poll(100)
time.sleep(0.1)
except Exception, e:
logging.critical("Critical exception: " + str(e))
break
I'm going to try and be helpful by going through that script and explaining what each bit does. This should help you understand what it's doing, and therefore what you need to do to get it running something else:
import fauxmo
This is a library that allows whatever device is running the script to pretend to be a Belkin WeMo; a device that is triggerable by the Echo.
import logging
import time
from debounce_handler import debounce_handler
This is importing some more libraries that the script will need. Logging will be used for logging things, which is useful for debugging, time will be used to cause the script to pause so that you can quit it by typing ctrl-c, and the debounce_handler library will be used to keep multiple Echos from reacting to the same voice command (which would cause a software bounce).
logging.basicConfig(level=logging.DEBUG)
Configures a logger that will allow events to be logged to assist in debugging.
class device_handler(debounce_handler):
"""Publishes the on/off state requested,
and the IP address of the Echo making the request.
"""
TRIGGERS = {"device": 52000}
def act(self, client_address, state, name):
print "State", state, "on ", name, "from client #", client_address
return True
We've created a class called device_handler which contains a dictionary called TRIGGERS and a function called act.
act takes a number of variables as input; self (any data structures in the class, such as our TRIGGERS dictionary), client_address, state, and name. We don't know what these are yet, but the names are quite self explanatory, so we can guess that client_address is probably going to be the IP address of the Echo, *state" that it is in, and name will be its name. This is the function that you're going to want to edit, since it is the final function triggered by the Echo. You can probably just stick whatever you function you want after the print statement. The act function returns True when called.
if __name__ == "__main__":
This will execute everything indented below it if you're running the script directly. More detail about that here if you want it.
# Startup the fauxmo server
fauxmo.DEBUG = True
p = fauxmo.poller()
u = fauxmo.upnp_broadcast_responder()
u.init_socket()
p.add(u)
As the comment suggests, this starts the fake WeMo server. We enable debugging, which just prints any debug messages to the command line, create a poller, p, which can process incoming messages, and create a upnp broadcast responder, u, which can handle UPnP device registration. We then tell u to initialise a socket, setting itself up on the network listening for UPnP devices, and add u to p so that we can respond when a broadcast is received.
# Register the device callback as a fauxmo handler
d = device_handler()
for trig, port in d.TRIGGERS.items():
fauxmo.fauxmo(trig, u, p, None, port, d)
As the comment says, this sets up an instance of the device handler class that we made earlier. Now we for-loop through the items in our TRIGGERS dictionary in our device handler d and calls fauxmo.fauxmo using the information it has found in the dictionary. If we look at the dictionary definition in the class definition we can see that there's only one entry, a trig device on port 52000. This essentially does the bulk of the work, making the actual fake WeMo device talk to the Echo. If we look at that fauxmo.fauxmo function we see that, when it receives a suitable trigger it calls the act function in the device_handler class we defined before.
# Loop and poll for incoming Echo requests
logging.debug("Entering fauxmo polling loop")
while True:
try:
# Allow time for a ctrl-c to stop the process
p.poll(100)
time.sleep(0.1)
except Exception, e:
logging.critical("Critical exception: " + str(e))
break
And here we enter the fauxmo polling loop. This indefinitely loops through the following code, checking to see if we've received a message. The code below it tries to poll for messages, to see if its received anything, then wait for a bit, then poll again. Except, if it can't do that for some reason, then the script will break and the error will be logged so you can see what went wrong.
Just to clarify; If the Fauxmo loop is running then the script is fine, right?
I think the TO is not getting any connection between the Echo and the WeMo fake device. It can help if you install the WeMo skill first. You may require an original WeMo device initially though.
I know these are old threads but it might help someone still.
I've made an IRC bot in Python and I've been trying to figure out a way to wait for an IRC command and return the message to a calling function for a while now. I refuse to use an external library for various reasons including I'm trying to learn to make these things from scratch. Also, I've been sifting through documentation for existing ones and they're way too comprehensive. I'me trying to make a simple one.
For example:
def who(bot, nick):
bot.send('WHO %s' % nick)
response = ResponseWaiter('352') # 352 - RPL_WHOREPLY
return response.msg
Would return a an object of my Message class that parses IRC messages to the calling function:
def check_host(bot, nick, host):
who_info = who(bot, nick)
if who_info.host == host:
return True
return False
I have looked at the reactor pattern, observer pattern, and have tried implementing a hundred different event system designs for this to no avail. I'm completely lost.
Please either provide a solution or point me in the right direction. There's got to be a simple way to do this.
So what I've done is use grab messages from my generator (a bot method) from the bot's who method. The generator looks like this:
def msg_generator(self):
''' Provides messages until bot dies '''
while self.alive:
for msg in self.irc.recv(self.buffer).split(('\r\n').encode()):
if len(msg) > 3:
try: yield Message(msg.decode())
except Exception as e:
self.log('%s %s\n' % (except_str, str(e)))
And now the bot's who method looks like this:
def who(self, nick):
self.send('WHO %s' % nick)
for msg in self.msg_generator():
if msg.command == '352':
return msg
However, it's now taking control of the messages, so I need some way of relinquishing the messages I'm not using for the who method to their appropriate handlers.
My bot generally handles all messages with this:
def handle(self):
for msg in self.msg_generator():
self.log('◀ %s' % (msg))
SpaghettiHandler(self, msg)
So any message that my SpaghettiHandler would be handling is not handled while the bot's who method uses the generator to receive messages.
It's working.. and works fast enough that it's hard to lose a message. But if my bot were to be taking many commands at the same time, this could become a problem. I'm pretty sure I'll find a solution in this direction, but I didn't create this as the answer because I'm not sure it's a good way, even when I have it set to relinquish messages that don't pertain to the listener.
I writing little customized ftp server and I need to suppress printing exceptions (well, one specific type of exception) to console but I want server to send "550 Requested action not taken: internal server error" or something like that to client.
However, when I catch exception using addErrback(), than I don't see exception in console but client gets OK status..
What could I do?
When you catch an error in errback handler, you should then inspect the type of the Failure and based on internal logic of your application send the Error as an FTP error message to the client
twisted.protocol.ftp.FTP handles this with self.reply(ERROR_CODE, "description")
So your code could look something like this:
from twisted.internet import ftp
MY_ERROR = ftp.REQ_ACTN_NOT_TAKEN
def failureCheck(failureInstance):
#do some magic to establish if we should reply an Error to this failure
return True
class myFTP(ftp.FTP):
def myActionX(self):
magicResult = self.doDeferredMagic()
magicResult.addCallback(self.onMagicSuccess)
magicResult.addErrback(self.onFailedMagic)
def onFailedMagic(self,failureInstance):
if failureCheck(failureInstance):
self.reply(MY_ERROR,'Add relevant failure information here')
else:
#do whatever other logic here
pass