Extra Characthers showing up after peeking at an MSMQ message - python

I am in the process of upgrading an older legacy system that is using Biztalk, MSMQs, Java, and python.
Currently, I am trying to upgrade a particular piece of the project which when complete will allow me to begin an in-place replacement of many of the legacy systems.
What I have done so far is recreate the legacy system in a newer version of Biztalk (2010) and on a machine that isn't on its last legs.
Anyway, the problem I am having is that there is a piece of Python code that picks up a message from an MSMQ and places it on another server. This code has been in place on our legacy system since 2004 and has worked since then. As far as I know, has never been changed.
Now when I rebuilt this, I started getting errors in the remote server and, after checking a few things out and eliminating many possible problems, I have established that the error occurs somewhere around the time the Python code is picking up from the MSMQ.
The error can be created using just two messages. Please note that I am using sample XMls here as the actual ones are pretty long.
Message one:
<xml>
<field1>Text 1</field1>
<field2>Text 2</field2>
</xml>
Message two:
<xml>
<field1>Text 1</field1>
</xml>
Now if I submit message one followed by message two to the MSMQ, they both appear correctly on the queue. If I then call the Python script, message one is returned correctly but message two gains extra characters.
Post-Python message two:
<xml>
<field1>Text 1</field1>
</xml>1>Te
I thought at first that there might have been scoping problems within the Python code but I have gone through that as well as I can and found none. However, I must admit that the first time that I've looked seriously at Python code is this project.
The Python code first peeks at a message and then receives it. I have been able to see the message when the script peeks and it has the same error message as when it receives.
Also, this error only shows up when going from a longer message to a shorter message.
I would welcome any suggestions of things that might be wrong, or things I could do to identify the problem.
I have googled and searched and gone a little crazy. This is holding an entire project up, as we can't begin replacing the older systems with this piece in place to act as a new bridge.
Thanks for taking the time to read through my problem.
Edit: Here's the relevant Python code:
import sys
import pythoncom
from win32com.client import gencache
msmq = gencache.EnsureModule('{D7D6E071-DCCD-11D0-AA4B-0060970DEBAE}', 0, 1, 0)
def Peek(queue):
qi = msmq.MSMQQueueInfo()
qi.PathName = queue
myq = qi.Open(msmq.constants.MQ_PEEK_ACCESS,0)
if myq.IsOpen:
# Don't loose this pythoncom.Empty thing (it took a while)
tmp = myq.Peek(pythoncom.Empty, pythoncom.Empty, 1)
myq.Close()
return tmp
The function calls this piece of code. I don't have access to the code that calls this until Monday, but the call is basically:
msg= MSMQ.peek()
2nd Edit.
I am attaching the first half of the script. this basically loops around
import base64, xmlrpclib, time
import MSMQ, Config, Logger
import XmlRpcExt,os,whrandom
QueueDetails = Config.InQueueDetails
sleeptime = Config.SleepTime
XMLRPCServer = Config.XMLRPCServer
usingBase64 = Config.base64ing
version=Config.version
verbose=Config.verbose
LogO = Logger.Logger()
def MSMQToIAMS():
# moved svr cons out of daemon loop
LogO.LogP(version)
svr = xmlrpclib.Server(XMLRPCServer, XmlRpcExt.getXmlRpcTransport())
while 1:
GotOne = 0
for qd in QueueDetails:
queue, agency, messagetype = qd
#LogO.LogD('['+version+"] Searching queue %s for messages"%queue)
try:
msg=MSMQ.Peek(queue)
except Exception,e:
LogO.LogE("Peeking at \"%s\" : %s"%(queue, e))
continue
if msg:
try:
msg = msg.__call__().encode('utf-8')
except:
LogO.LogE("Could not convert massege on \"%s\" to a string, leaving it on queue"%queue)
continue
if verbose:
print "++++++++++++++++++++++++++++++++++++++++"
print msg
print "++++++++++++++++++++++++++++++++++++++++"
LogO.LogP("Found Message on \"%s\" : \"%s...\""%(queue, msg[:40]))
try:
rv = svr.accept(msg, agency, messagetype)
if rv[0] != "OK":
raise Exception, rv[0]
LogO.LogP('Message has been sent successfully to IAMS from %s'%queue)
MSMQ.Receive(queue)
GotOne = 1
StoreMsg(msg)
except Exception, e:
LogO.LogE("%s"%e)
if GotOne == 0:
time.sleep(sleeptime)
else:
gotOne = 0
This is the full code that calls MSMQ. Creates a little program that watches MSMQ and when a message arrives picks it up and sends it off to another server.

Sounds really Python-specific (of which I know nothing) rather then MSMQ-specific. Isn't this just a case of a memory variable being used twice without being cleared in between? The second message is shorter than the first so there are characters from the first not being overwritten. What do the relevant parts of the Python code look like?
[[21st April]]
The code just shows you are populating the tmp variable with a message. What happens to tmp before the next message is accessed? I'm assuming it is not cleared.

Related

Selenium Get An Element Quickly

Currently I'm making a python bot for whatsapp manually without APIs or that sort because I am clueless. As such, I'm using Selenium to take in messages and auto reply. Currently, I'm noticing that every few messages, one message doesn't get picked up because the loops ran are too slow and my computer is already pretty fast. Here's the code:
def incoming_msges():
msges = driver.find_elements_by_class_name("message-in")
msgq = []
tq = []
try:
for msg in msges:
txt_msg = msg.find_elements_by_class_name("copyable-text")
time = msg.find_elements_by_class_name("_18lLQ")
for t in time:
tq.append(t.text.lower())
for txt in txt_msg:
msgq.append(txt.text.lower())
msgq = msgq[-1]
tq = tq[-1]
if len(msgq) > 0:
return (msgq, tq)
except StaleElementReferenceException:
pass
return False
Previously, I didn't add the time check thing, and the message sent would be saved, with this loop continuously running such that even if the other party sent the same thing again, the code would not recognise it as a new message because it thinks it's the same one as before. So now, the problem is that my code is super time consuming and I have no idea how to speed it up. I tried doing this:
def incoming_msges():
msges = browser.find_elements_by_class_name("message-in")
try:
msg = msges[-1]
txt_msg = msg.find_element_by_xpath("/span[#class=\"copyable-text\"]").text.lower()
time = msg.find_element_by_xpath("/span[#class=\"_18lLQ\"]").text.lower()
return (txt_msg, time)
except Exception:
pass
return False
However, like this, the code just doesn't find any messages. I have gotten the elements' types and classes correct according to the whatsapp web website but it just doesn't run. What's the correct way of rewriting my first code block as it is still correct? Thanks in advance.
First thing first ...
I definitely recommend using API ... Because what you are trying to do here is to reinvent the wheel. API has the power of telling you if there is a change in your status and you can queue these changes ... So I definitely recommend to use API ... It might be hard at the beginning, but trust me, its worth it ...
Next I would recommend you to use normal variable names. msges msgq tq (these are kindof unreadable and I still dont get what they are supposed to be after reading the code twice ...)
But to your speed problem ... "try - catch (aka except)" blocks are really heavy on a performance ... I would recommend to use safe programming if possible (20 if statements might be faster, but might not a same time) ... Also I think you are kind of unaware of a python language (atleast from what i can see here)
msgq = msgq[-1] # you are telling it to take the last element and change array variable to string .. to be more specific...
msgq ([1,2,3,4]) = msgq[-1] (4) will result to -> msgq = 4 (which in my option hits you performance as well)
tq = tq[-1] # same here
This would be better :)
if len(msgq[-1]) > 0:
return (msgq[-1], tq[-1])
If I understand your code correctly, you are trying to scrape the messages, but if its like you are saying that you want to make auto-reply bot, I would recommend you to eighter get ready for some JS magic or switch tool. I personally noticed that the selenium has a problem with dynamic content ... to be more specific ... once its at the end of the file it does not scrape it again ... so if you do not want to auto refresh every 5-10 seconds to get the latest HTML file, I recommend eighter to create this bot in JS (that will trigger everytime that an element changes) or use the API and use selenium just for responses. I was told that Selenium was created to simulate the common user to check if user interface works as it should (if buttons exists, if the website contains all what it should etc.) ... I think that selenium is for this job something like a flower small sponge for a car clean ... you can do it ... buts gonna cost you alot of time and you might miss some spots (like you missed those messages) ...
Lastly ... the work with strings in general is really costly. you are doing O(n^2) of operations in a try block ... which i can imagine can be really costly ... if its possible, I would reduce the number of inner for loops.
I wish you good luck in this project and I hope you find the answer you seek, while I hope my answer was at least a little helpful.

How to transfer a value from a function in one script to another script without re-running the function (python)?

I'm really new to programming in general and very inexperienced, and I'm learning python as I think it's more simple than other languages. Anyway, I'm trying to use Flask-Ask with ngrok to program an Alexa skill to check data online (which changes a couple of times per hour). The script takes four different numbers (from a different URL) and organizes it into a dictionary, and uses Selenium and phantomjs to access the data.
Obviously, this exceeds the 8-10 second maximum runtime for an intent before Alexa decides that it's taken too long and returns an error message (I know its timing out as ngrok and the python log would show if an actual error occurred, and it invariably occurs after 8-10 seconds even though after 8-10 seconds it should be in the middle of the script). I've read that I could just reprompt it, but I don't know how and that would only give me 8-10 more seconds, and the script usually takes about 25 seconds just to get the data from the internet (and then maybe a second to turn it into a dictionary).
I tried putting the getData function right after the intent that runs when the Alexa skill is first invoked, but it only runs when I initialize my local server and just holds the data for every new Alexa session. Because the data changes frequently, I want it to perform the function every time I start a new session for the skill with Alexa.
So, I decided just to outsource the function that actually gets the data to another script, and make that other script run constantly in a loop. Here's the code I used.
import time
def getData():
username = '' #username hidden for anonymity
password = '' #password hidden for anonymity
browser = webdriver.PhantomJS(executable_path='/usr/local/bin/phantomjs')
browser.get("https://gradebook.com") #actual website name changed
browser.find_element_by_name("username").clear()
browser.find_element_by_name("username").send_keys(username)
browser.find_element_by_name("password").clear()
browser.find_element_by_name("password").send_keys(password)
browser.find_element_by_name("password").send_keys(Keys.RETURN)
global currentgrades
currentgrades = []
gradeids = ['2018202', '2018185', '2018223', '2018626', '2018473', '2018871', '2018886']
for x in range(0, len(gradeids)):
try:
gradeurl = "https://www.gradebook.com/grades/"
browser.get(gradeurl)
grade = browser.find_element_by_id("currentStudentGrade[]").get_attribute('innerHTML').encode('utf8')[0:3]
if grade[2] != "%":
grade = browser.find_element_by_id("currentStudentGrade[]").get_attribute('innerHTML').encode('utf8')[0:4]
if grade[1] == "%":
grade = browser.find_element_by_id("currentStudentGrade[]").get_attribute('innerHTML').encode('utf8')[0:1]
currentgrades.append(grade)
except Exception:
currentgrades.append('No assignments found')
continue
dictionary = {"class1": currentgrades[0], "class2": currentgrades[1], "class3": currentgrades[2], "class4": currentgrades[3], "class5": currentgrades[4], "class6": currentgrades[5], "class7": currentgrades[6]}
return dictionary
def run():
dictionary = getData()
time.sleep(60)
That script runs constantly and does what I want, but then in my other script, I don't know how to just call the dictionary variable. When I use
from getdata.py import dictionary
in the Flask-ask script it just runs the loop and constantly gets the data. I just want the Flask-ask script to take the variable defined in the "run" function and then use it without running any of the actual scripts defined in the getdata script, which have already run and gotten the correct data. If it matters, both scripts are running in Terminal on a MacBook.
Is there any way to do what I'm asking about, or are there any easier workarounds? Any and all help is appreciated!
It sounds like you want to import the function, so you can run it; rather than importing the dictionary.
try deleting the run function and then in your other script
from getdata import getData
Then each time you write getData() it will run your code and get a new up-to-date dictionary.
Is this what you were asking about?
This issue has been resolved.
As for the original question, I didn't figure out how to make it just import the dictionary instead of first running the function to generate the dictionary. Furthermore, I realized there had to be a more practical solution than constantly running a script like that, and even then not getting brand new data.
My solution was to make the script that gets the data start running at the same time as the launch function. Here was the final script for the first intent (the rest of it remained the same):
#ask.intent("start_skill")
def start_skill():
welcome_message = 'What is the password?'
thread = threading.Thread(target=getData, args=())
thread.daemon = True
thread.start()
return question(welcome_message)
def getData():
#script to get data here
#other intents and rest of script here
By design, the skill requested a numeric passcode to make sure I was the one using it before it was willing to read the data (which was probably pointless, but this skill is at least as much for my own educational reasons as for practical reasons, so, for the extra practice, I wanted this to have as many features as I could possibly justify). So, by the time you would actually be able to ask for the data, the script to get the data will have finished running (I have tested this and it seems to work without fail).

Python: PySerial disconnecting from device randomly

I have a process that runs data acquisition using PySerial. It's working fine now, but there's a weird thing I had to do to make it work continuously, and I'm not sure this is normal, so I'm asking this question.
What happens: It looks like that the connection drops now and then! Around once every 30-60 minutes, with big error bars (could go for hours and be OK, but sometimes happens often).
My question: Is this standard?
My temporary solution: I wrote a simple "reopen" function that looks like this:
def ReopenDevice(devObject):
try:
devObject.close()
devObject.open()
except Exception as e:
print("Error while trying to connect to device " + devObject.port + ". The error says: " + str(e))
time.sleep(2)
And what I do is that if data pulling fails for 2 minutes, I reopen the device with this function, and it continues working well with no problems.
My program model: It's a GUI program, where the user clicks something like "Start", and that button does some preparations and runs a function through multiprocessing.Process() that starts with:
devObj = serial.Serial()
#... other params
devObj.open()
and that function then runs a while loop that keeps polling data with something like:
bytesToRead = devObj.inWaiting()
if bytesToRead != 0:
buffer = decodeString(devObj.read(bytesToRead))
#process buffer and push it to a list...
The way I know that the problem happened, is that devObj.inWaiting() Keeps returning zero... no matter how much data there's on the device!
Is this behavior expected and should always be considered whether it happens or doesn't happen?
The problem reduced a lot after not calling inWaiting() very frequently. Anyway, I kept the reconnect part to ensure that my program never fails. Thanks for "Kobi K" for suggesting the possible cause of the problem.

Python Memory leak using Yocto

I'm running a python script on a raspberry pi that constantly checks on a Yocto button and when it gets pressed it puts data from a different sensor in a database.
a code snippet of what constantly runs is:
#when all set and done run the program
Active = True
while Active:
if ResponseType == "b":
while Active:
try:
if GetButtonPressed(ResponseValue):
DoAllSensors()
time.sleep(5)
else:
time.sleep(0.5)
except KeyboardInterrupt:
Active = False
except Exception, e:
print str(e)
print "exeption raised continueing after 10seconds"
time.sleep(10)
the GetButtonPressed(ResponseValue) looks like the following:
def GetButtonPressed(number):
global buttons
if ModuleCheck():
if buttons[number - 1].get_calibratedValue() < 300:
return True
else:
print "module not online"
return False
def ModuleCheck():
global moduleb
return moduleb.isOnline()
I'm not quite sure about what might be going wrong. But it takes about an hour before the RPI runs out of memory.
The memory increases in size constantly and the button is only pressed once every 15 minutes or so.
That already tells me that the problem must be in the code displayed above.
The problem is that the yocto_api.YAPI object will continue to accumulate _Event objects in its _DataEvents dict (a class-wide attribute) until you call YAPI.YHandleEvents. If you're not using the API's callbacks, it's easy to think (I did, for hours) that you don't need to ever call this. The API docs aren't at all clear on the point:
If your program includes significant loops, you may want to include a call to this function to make sure that the library takes care of the information pushed by the modules on the communication channels. This is not strictly necessary, but it may improve the reactivity of the library for the following commands.
I did some playing around with API-level callbacks before I decided to periodically poll the sensors in my own code, and it's possible that some setting got left enabled in them that is causing these events to accumulate. If that's not the case, I can't imagine why they would say calling YHandleEvents is "not strictly necessary," unless they make ARM devices with unlimited RAM in Switzerland.
Here's the magic static method that thou shalt call periodically, no matter what. I'm doing so once every five seconds and that is taking care of the problem without loading down the system at all. API code that would accumulate unwanted events still smells to me, but it's time to move on.
#noinspection PyUnresolvedReferences
#staticmethod
def HandleEvents(errmsgRef=None):
"""
Maintains the device-to-library communication channel.
If your program includes significant loops, you may want to include
a call to this function to make sure that the library takes care of
the information pushed by the modules on the communication channels.
This is not strictly necessary, but it may improve the reactivity
of the library for the following commands.
This function may signal an error in case there is a communication problem
while contacting a module.
#param errmsg : a string passed by reference to receive any error message.
#return YAPI.SUCCESS when the call succeeds.
On failure, throws an exception or returns a negative error code.
"""
errBuffer = ctypes.create_string_buffer(YAPI.YOCTO_ERRMSG_LEN)
#noinspection PyUnresolvedReferences
res = YAPI._yapiHandleEvents(errBuffer)
if YAPI.YISERR(res):
if errmsgRef is not None:
#noinspection PyAttributeOutsideInit
errmsgRef.value = YByte2String(errBuffer.value)
return res
while len(YAPI._DataEvents) > 0:
YAPI.yapiLockFunctionCallBack(errmsgRef)
if not (len(YAPI._DataEvents)):
YAPI.yapiUnlockFunctionCallBack(errmsgRef)
break
ev = YAPI._DataEvents.pop(0)
YAPI.yapiUnlockFunctionCallBack(errmsgRef)
ev.invokeData()
return YAPI.SUCCESS

Have a python function run for an alotted time

I have a python script that pulls from various internal network sources. With how our systems are set up we will initiate a urllib pull from a network location and it will get hung up waiting forever for a response on certain parts of the network. I would like my script to check that if it hasnt finished the pull in lets say 5 minutes it will pass the function and attempt to pull from the next address, and record it to a bad directory repository(so we can go check out which systems get hung up, there's like over 20,000 IP addresses we are checking some with some older scripts running on them that no longer work but will still try and run when requested, and they never stop trying to run)
Im familiar with having a script pause at a certain point
import time
time.sleep(300)
What Im thinking from a psuedo code perspective (not proper python just illustrating the idea)
import time
import urllib2
url_dict = ['http://1', 'http://2', 'http://3', ...]
fail_log_path = 'C:/Temp/fail_log.txt'
for addresses in url_dict:
clock_value = time.start()
while clock_value <= 300:
print str(clock_value)
res = urllib2.retrieve(url)
if res != []:
pass
else:
fail_log = open(fail_log_path, 'a')
fail_log.write("Failed to pull from site location: " + str(url) + "\n")
faile_log.close
Update: a specific option for this dealing with urls timeout for urllib2.urlopen() in pre Python 2.6 versions
Found this answer which is more in line with the overall problem of my question:
kill a function after a certain time in windows
Your code as is doesn't seem to describe what you were saying. It seems you want the if/else check inside your while loop. On top of that, you would want to loop over the ip addresses and not over a time period as your code is currently written (otherwise you will keep requesting the same ip address every time). Instead of keeping track of time yourself, I would suggest reading up on urllib.request.urlopen - specifically the timeout parameter. Once set, that function call will throw a socket.timeout exception once the time limit is reached. Surround that with a try/except block catching that error and then handle it appropriately.

Categories