python process communications via pipes: Race condition

I have two Python 3.2 processes that need to communicate with each other. Most of the information that needs to be communicated is standard dictionaries. Named pipes seemed like the way to go, so I made a pipe class that can be instantiated in both processes. This class implements a very basic protocol for getting information around.
My problem is that sometimes it works, sometimes it doesn't. There seems to be no pattern to this behavior except the place where the code fails.
Here are the bits of the Pipe class that matter. Shout if you want more code:
import os
import string

class Pipe:
    """
    there are a bunch of constants set up here. I don't think it would be useful to include them. Just think like this: Pipe.WHATEVER = 'WHATEVER'
    """
    def __init__(self,sPath):
        """
        create the fifo. if it already exists just associate with it
        """
        self.sPath = sPath
        if not os.path.exists(sPath):
            os.mkfifo(sPath)
        self.iFH = os.open(sPath,os.O_RDWR | os.O_NONBLOCK)
        self.iFHBlocking = os.open(sPath,os.O_RDWR)
    def write(self,dMessage):
        """
        write the dict to the fifo
        if dMessage is not a dictionary then there will be an exception here. There never is
        """
        self.writeln(Pipe.MESSAGE_START)
        for k in dMessage:
            self.writeln(Pipe.KEY)
            self.writeln(k)
            self.writeln(Pipe.VALUE)
            self.writeln(dMessage[k])
        self.writeln(Pipe.MESSAGE_END)
    def writeln(self,s):
        os.write(self.iFH,bytes('{0} : {1}\n'.format(Pipe.LINE_START,len(s)+1),'utf-8'))
        os.write(self.iFH,bytes('{0}\n'.format(s), 'utf-8'))
        os.write(self.iFH,bytes(Pipe.LINE_END+'\n','utf-8'))
    def readln(self):
        """
        look for LINE_START, get line size
        read until LINE_END
        clean up
        return string
        """
        iLineStartBaseLength = len(self.LINE_START)+3 #'{0} : '
        try:
            s = os.read(self.iFH,iLineStartBaseLength).decode('utf-8')
        except:
            return Pipe.READLINE_FAIL
        if Pipe.LINE_START in s:
            #get the length of the line
            sLineLen = ''
            while True:
                try:
                    sCurrent = os.read(self.iFH,1).decode('utf-8')
                except:
                    return Pipe.READLINE_FAIL
                if sCurrent == '\n':
                    break
                sLineLen += sCurrent
            try:
                iLineLen = int(sLineLen.strip(string.punctuation+string.whitespace))
            except:
                raise Exception('Not a valid line length: "{0}"'.format(sLineLen))
            #read the line
            sLine = os.read(self.iFHBlocking,iLineLen).decode('utf-8')
            #read the line terminator
            sTerm = os.read(self.iFH,len(Pipe.LINE_END+'\n')).decode('utf-8')
            if sTerm == Pipe.LINE_END+'\n':
                return sLine
            return Pipe.READLINE_FAIL
        else:
            return Pipe.READLINE_FAIL
    def read(self):
        """
        read from the fifo, make a dict
        """
        dRet = {}
        sKey = ''
        sValue = ''
        sCurrent = None
        def value_flush():
            nonlocal dRet, sKey, sValue, sCurrent
            if sKey:
                dRet[sKey.strip()] = sValue.strip()
            sKey = ''
            sValue = ''
            sCurrent = ''
        if self.message_start():
            while True:
                sLine = self.readln()
                if Pipe.MESSAGE_END in sLine:
                    value_flush()
                    return dRet
                elif Pipe.KEY in sLine:
                    value_flush()
                    sCurrent = Pipe.KEY
                elif Pipe.VALUE in sLine:
                    sCurrent = Pipe.VALUE
                else:
                    if sCurrent == Pipe.VALUE:
                        sValue += sLine
                    elif sCurrent == Pipe.KEY:
                        sKey += sLine
        else:
            return Pipe.NO_MESSAGE
It sometimes fails here (in readln):
try:
    iLineLen = int(sLineLen.strip(string.punctuation+string.whitespace))
except:
    raise Exception('Not a valid line length: "{0}"'.format(sLineLen))
It doesn't fail anywhere else.
An example error is:
Not a valid line length: "KE 17"
The fact that it's intermittent says to me that it's due to some kind of race condition, I'm just struggling to figure out what it might be. Any ideas?
EDIT added stuff about calling processes
The Pipe is instantiated in process A and process B by calling the constructor with the same path. Process A then intermittently writes to the Pipe and process B tries to read from it. At no point do I try to use it as a two-way channel.
Here is a more long-winded explanation of the situation. I've been trying to keep the question short but I think it's about time I give up on that. I have a daemon and a Pyramid process that need to play nice. There are two Pipe instances in use: one that only Pyramid writes to, and one that only the daemon writes to. The stuff Pyramid writes is really short, and I have experienced no errors on this pipe. The stuff that the daemon writes is much longer, and this is the pipe that's giving me grief. Both pipes are implemented in the same way. Both processes only write dictionaries to their respective Pipes (if this were not the case then there would be an exception in Pipe.write).
The basic algorithm is: Pyramid spawns the daemon, and the daemon loads a crazy object hierarchy of doom and vast RAM consumption. Pyramid sends POST requests to the daemon, which then does a whole bunch of calculations and sends data to Pyramid so that a human-friendly page can be rendered. The human can then respond to what's in the hierarchy by filling in HTML forms and suchlike, causing Pyramid to send another dictionary to the daemon and the daemon to send back a dictionary response.
So: only one pipe has exhibited any problems, the problem pipe has a lot more traffic than the other one, and it is guaranteed that only dictionaries are written to either.
EDIT as response to question and comment
Before you tell me to take out the try...except stuff read on.
The fact that the exception gets raised at all is what is bothering me. iLineLen = int(stuff) looks to me like it should always be passed a string that looks like an integer. This is the case only most of the time, not all of it. So if you feel the urge to comment about how it's probably not an integer, please don't.
To paraphrase my question: Spot the race condition and you will be my hero.
EDIT a little example:
process_1.py:
oP = Pipe(some_path)
while 1:
    oP.write({'a':'foo','b':'bar','c':'erm...','d':'plop!','e':'etc'})
process_2.py:
oP = Pipe(same_path_as_before)
while 1:
    print(oP.read())

After playing around with the code, I suspect the problem is coming from how you are reading the file.
Specifically, lines like this:
os.read(self.iFH, iLineStartBaseLength)
That call doesn't necessarily return iLineStartBaseLength bytes. It might consume just "LI", after which readln returns READLINE_FAIL and the caller retries. The second attempt then picks up the remainder of the line, the reader is out of step with the writer, and somehow a non-numeric string ends up in the int() call.
The unpredictability likely comes from how the fifo is being flushed - if it happens to flush when the complete line is written, all is fine. If it flushes when the line is half-written, weirdness.
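The short-read behaviour described above can be worked around by looping until the requested number of bytes has arrived. This is a minimal sketch rather than the asker's code: read_exact is a hypothetical helper name, and it assumes a blocking file descriptor (on a non-blocking one, os.read raises BlockingIOError when nothing is ready, which would need separate handling).

```python
import os

def read_exact(fd, n):
    """Read exactly n bytes from a blocking descriptor, looping over short reads.

    read_exact is a hypothetical helper, not part of the os module.
    """
    chunks = []
    remaining = n
    while remaining > 0:
        # os.read may legitimately return fewer bytes than requested
        chunk = os.read(fd, remaining)
        if not chunk:
            # an empty result means every writer has closed the stream
            raise EOFError("stream closed with %d bytes still expected" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

With a helper like this, the header, the payload and the terminator would each be read in full or not at all, so a retry never starts in the middle of a line.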
At least in the hacked-up version of the script I ended up with, the oP.read() call in process_2.py often got a different dict to the one sent (where the KEY might bleed into the previous VALUE and other strangeness).
I might be mistaken, as I had to make a bunch of changes to get the code running on OS X, and further while experimenting. My modified code here
Not sure exactly how to fix it, but.. with the json module or similar, the protocol/parsing can be greatly simplified - newline separated JSON data is much easier to parse:
import os
import time
import json
import errno
def retry_write(*args, **kwargs):
    """Like os.write, but retries until EAGAIN stops appearing
    """
    while True:
        try:
            return os.write(*args, **kwargs)
        except OSError as e:
            if e.errno == errno.EAGAIN:
                time.sleep(0.5)
            else:
                raise
class Pipe(object):
    """FIFO based IPC based on newline-separated JSON
    """
    ENCODING = 'utf-8'
    def __init__(self,sPath):
        self.sPath = sPath
        if not os.path.exists(sPath):
            os.mkfifo(sPath)
        self.fd = os.open(sPath,os.O_RDWR | os.O_NONBLOCK)
        self.file_blocking = open(sPath, "r", encoding=self.ENCODING)
    def write(self, dmsg):
        serialised = json.dumps(dmsg) + "\n"
        dat = bytes(serialised.encode(self.ENCODING))
        # This blocks until data can be read by other process.
        # Can just use os.write and ignore EAGAIN if you want
        # to drop the data
        retry_write(self.fd, dat)
    def read(self):
        serialised = self.file_blocking.readline()
        return json.loads(serialised)
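The newline-separated JSON framing can be tried out without a FIFO. The sketch below is an illustration under assumptions, not part of the answer's code: send_message, receive_message and roundtrip are hypothetical names, and an anonymous os.pipe stands in for the named pipe. It only suits small messages, because everything is written before anything is read.

```python
import json
import os

def send_message(writer, message):
    """Write one dict as a single JSON line."""
    # json.dumps escapes newlines inside strings, so "\n" only ever ends a message
    writer.write(json.dumps(message) + "\n")
    writer.flush()

def receive_message(reader):
    """Read one JSON line; return None at end of stream."""
    line = reader.readline()
    if not line:
        return None
    return json.loads(line)

def roundtrip(messages):
    """Push small messages through an anonymous pipe and read them back."""
    read_fd, write_fd = os.pipe()
    with os.fdopen(write_fd, "w", encoding="utf-8") as writer:
        for message in messages:
            send_message(writer, message)
    received = []
    with os.fdopen(read_fd, "r", encoding="utf-8") as reader:
        while True:
            message = receive_message(reader)
            if message is None:
                break
            received.append(message)
    return received
```

Because each message is exactly one line, the reader never needs length headers or start and end markers, which removes the places where the original protocol could fall out of step.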

Try getting rid of the try:, except: blocks and seeing what exception is actually being thrown.
So replace your sample with just:
iLineLen = int(sLineLen.strip(string.punctuation+string.whitespace))
I bet it'll now throw a ValueError, and it's because you're trying to cast "KE 17" to an int.
You'll need to strip more than string.whitespace and string.punctuation if you're going to cast the string to an int.
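To see the underlying exception in isolation, the parsing step can be pulled out on its own. This is only a sketch: parse_line_length is a hypothetical wrapper around the expression from the question.

```python
import string

def parse_line_length(raw):
    """Apply the question's parsing step to the text read before the newline."""
    # strip() only trims the ends, so letters inside the text survive
    return int(raw.strip(string.punctuation + string.whitespace))
```

A well-formed remainder such as " : 17" parses to 17, while "KE 17" reaches int() unchanged and raises ValueError, which is the failure the broad except clause was wrapping.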

Related

Loop to check if a variable has changed in Python

I have just learned the basics of Python, and I am trying to make a few projects so that I can increase my knowledge of the programming language.
Since I am rather paranoid, I created a script that uses PycURL to fetch my current IP address every x seconds, for VPN security. Here is my code[EDITED]:
import requests
enterIP = str(input("What is your current IP address?"))
def getIP():
    while True:
        try:
            result = requests.get("http://ipinfo.io/ip")
            print(result.text)
        except KeyboardInterrupt:
            print("\nProccess terminated by user")
        return result.text
def checkIP():
    while True:
        if enterIP == result.text:
            pass
        else:
            print("IP has changed!")
getIP()
checkIP()
Now I would like to expand the idea, so that the script asks the user to enter their current IP, saves that octet as a string, then uses a loop to keep running it against the PycURL function to make sure that their IP hasn't changed? The only problem is that I am completely stumped, I cannot come up with a function that would take the output of PycURL and compare it to a string. How could I achieve that?

As @holdenweb explained, you do not need pycurl for such a simple task, but nevertheless, here is a working example:
import pycurl
import time
from StringIO import StringIO
def get_ip():
    buffer = StringIO()
    c = pycurl.Curl()
    c.setopt(pycurl.URL, "http://ipinfo.io/ip")
    c.setopt(c.WRITEDATA, buffer)
    c.perform()
    c.close()
    return buffer.getvalue()
def main():
    initial = get_ip()
    print 'Initial IP: %s' % initial
    try:
        while True:
            current = get_ip()
            if current != initial:
                print 'IP has changed to: %s' % current
            time.sleep(300)
    except KeyboardInterrupt:
        print("\nProccess terminated by user")
if __name__ == '__main__':
    main()
As you can see, I moved the logic of getting the IP into a separate function, get_ip, and added a few missing things, like capturing the response in a buffer and returning it as a string. Otherwise it is pretty much the same as the first example in the pycurl quickstart.
The main function is called at the bottom when the script is run directly (not imported).
It first calls get_ip to get the initial IP and then runs the while loop, which checks whether the IP has changed and lets you know if so.
EDIT:
Since you changed your question, here is your new code in a working example:
import requests
def getIP():
    result = requests.get("http://ipinfo.io/ip")
    return result.text
def checkIP():
    initial = getIP()
    print("Initial IP: {}".format(initial))
    while True:
        current = getIP()
        if initial == current:
            pass
        else:
            print("IP has changed!")
checkIP()
As I mentioned in the comments above, you do not need two loops; one is enough. You don't even need two functions, but it is better to have them: one for getting the data and one for the loop. In the latter, first get the initial value and then run the loop, inside which you check whether the value has changed.
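The "fetch once, then poll in a single loop" pattern can be written so that it is independent of how the value is fetched. The sketch below is an assumption-laden illustration: watch_for_change and its parameters (fetch, interval_seconds, max_checks, sleep) are hypothetical names, and the pause between polls is an addition that the example above does not have.

```python
import time

def watch_for_change(fetch, interval_seconds=300, max_checks=None, sleep=time.sleep):
    """Poll fetch() and return the first value that differs from the initial one.

    Returns None if max_checks polls pass without a change. With the default
    max_checks=None the loop runs until a change is seen.
    """
    initial = fetch().strip()          # strip the trailing newline the service returns
    checks = 0
    while max_checks is None or checks < max_checks:
        sleep(interval_seconds)        # wait between polls instead of hammering the service
        current = fetch().strip()
        if current != initial:
            return current
        checks += 1
    return None
```

With requests, the fetch argument could be something like lambda: requests.get("http://ipinfo.io/ip").text. Passing the sleep function in makes the loop easy to exercise without waiting.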

It seems, from reading the pycurl documentation, like you would find it easier to solve this problem using the requests library. Curl is more to do with file transfer, so the library expects you to provide a file-like object into which it writes the contents. This would greatly complicate your logic.
requests allows you to access the text of the server's response directly:
>>> import requests
>>> result = requests.get("http://ipinfo.io/ip")
>>> result.text
'151.231.192.8\n'
As @PeterWood suggested, a function would be more appropriate than a class for this. If the script is going to run continuously, a simple loop as the body of the program is enough.

What's the best way to implement an SCPI command tree structure as Python class methods?

SCPI commands are strings built of mnemonics that are sent to an instrument to modify/retrieve its settings and read measurements. I'd like to be able to build and send a string like this: "SENSe:VOLTage:DC:RANGe 10 V"
With code like this:
inst.sense.voltage.dc.range('10 V')
But it seems like a very deep rabbit hole and I'm not sure I want to start down it. Loads of classes for each subsystem and option....
Would a better approach be something like:
def sense_volt(self, currenttype='DC', RANG='10 V'):
    # a space, not a colon, separates the command header from its argument
    cmdStr = 'SENS:VOLT:' + currenttype + ':RANG ' + RANG
    return self.inst.write(cmdStr)
It's trivial to just implement the commands I need and leave inst.query(arg) and inst.write(arg) methods for manually building additional commands, but I'd like to eventually have a complete instrument interface where all commands are covered by autocompleteable methods.
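One way to get the inst.sense.voltage.dc.range('10 V') call style without writing a class per subsystem is to build the command path lazily through attribute access. This is a rough sketch under assumptions, not a complete instrument interface: ScpiNode is a hypothetical class, send stands in for whatever write or query callable the instrument library provides, and mnemonics are emitted in upper-case long form on the assumption that the instrument accepts long-form, case-insensitive mnemonics as SCPI describes.

```python
class ScpiNode:
    """One level of a SCPI command path, built lazily through attribute access."""

    def __init__(self, send, path=()):
        self._send = send      # callable that receives the finished command string
        self._path = path      # tuple of mnemonics collected so far

    def __getattr__(self, name):
        # __getattr__ only runs for missing attributes, so _send and _path are unaffected
        if name.startswith("_"):
            raise AttributeError(name)
        return ScpiNode(self._send, self._path + (name.upper(),))

    def __call__(self, *args):
        # calling a node sends the path, followed by comma-separated arguments if any
        command = ":".join(self._path)
        if args:
            command += " " + ",".join(str(a) for a in args)
        return self._send(command)

    def query(self):
        # append "?" to turn the collected path into a query
        return self._send(":".join(self._path) + "?")
```

The trade-off is that nothing here is known ahead of time, so an editor cannot autocomplete the mnemonics and a typo is only caught by the instrument. That is the opposite of the generated-methods approach, which trades size for discoverability.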

I managed it, though I'm pretty unhappy with the results. It's been three years since I touched any of this, so I'm not sure why I did some things the way I did. I know I had a lot of trouble wrapping my head around organizing everything for portability. I'm sure that now, after a long wait, I'll be inundated with suggestions for how I should have done it.
I wrote parsing functions to interpret SCPI commands. They can separate the command from the arguments, determine required and optional arguments, recognize queries, etc. For example:
def command_name(scpi_command):
    cmd = scpi_command.split(' ')[0]
    new = ''
    for i, c in enumerate(cmd):
        if c in '[]':
            continue
        if c.isupper() or c.isdigit():
            new += c.lower()
            continue
        if c == ':':
            new += '_'
            continue
        if c == '?':
            new += '_qry'
            break
    return new
The command from CONFigure[:VOLTage]:DC [{<range>|AUTO|MIN|MAX|DEF} [, {<resolution>|MIN|MAX|DEF}]] can be parsed as conf_volt_dc, or conf_dc. I did not use the abridged option in my experiments. I think some of my dissatisfaction stemmed from the extra-long method names.
I wrote a builder script to read the commands from a file, parse them, and write a new script with a "command set" class extending a "command handler." Every command is parsed and converted into one of a few boilerplate methods, such as:
query_str_no_args = r'''
    def {name}(self):
        """SCPI instrument query.
        {s}
        """
        cmd = '{s}'
        return self._command_handler(command=cmd)
'''
The parser also includes a command handler class to process commands and arguments. Each command is parsed from its example text, a la _command_handler(0.1, 'MAX', command='CONFigure[:VOLTage]:DC [{<range>|AUTO|MIN|MAX|DEF} [, {<resolution>|MIN|MAX|DEF}]]'). Arguments (if any) are validated, and the command string is rebuilt and sent to the instrument:
class Cmd_Handler():
    def _command_handler(self, *args, command):
        # command string 'SENS:VOLT:RANG'
        cmd_str = command_string(command)
        # argument dictionary {0: [True, '<range>', 'AUTO', 'MIN', 'MAX', 'DEF'],
        #                      1: [False, '<resolution>', 'MIN', 'MAX', 'DEF']}
        arg_dict = command_args(command)
        if debug_mode: print(arg_dict)
        for k, v in arg_dict.items():
            for i, j in enumerate(v[1:]):
                ...
        # Validate arguments
        # count mandatory arguments
        if len(args) < len([arg_dict[k] for k in arg_dict.keys() if arg_dict[k][0]]):
            ...
        if '?' in cmd_str:
            return self._query(cmd_str + arg_str)
        else:
            return self._write(cmd_str + arg_str)
When the builder is done, you end up with a big script (the Ag34401 and Ag34461 files were 2600 and 3600 lines, respectively) that looks like this:
#!/usr/bin/env python3
from scpi.scpi_parse import Cmd_Handler
class Ag34461A_CS(Cmd_Handler):
    ...
    def calc_scal_stat(self, *args):
        """SCPI instrument command.
        CALCulate:SCALe[:STATe] {OFF|ON}
        """
        cmd = 'CALCulate:SCALe[:STATe] {OFF|ON}'
        return self._command_handler(*args, command=cmd)
    def calc_scal_stat_qry(self):
        """SCPI instrument query.
        CALCulate:SCALe[:STATe]?
        """
        cmd = 'CALCulate:SCALe[:STATe]?'
        return self._command_handler(command=cmd)
Then you extend this class with your short, sweet instrument class:
#!/usr/bin/env python3
import pyvisa as visa
from scpi.cmd_sets.ag34401a_cs import Ag34401A_CS # Extend the command set
class Ag34401A(Ag34401A_CS):
    def __init__(self, resource_name=None):
        rm = visa.ResourceManager()
        self._inst = None
        if resource_name:
            # keep the explicitly requested resource (it was opened but never stored)
            self._inst = rm.open_resource(resource_name)
        else:
            # otherwise take the first resource that identifies as a 34401
            for resource in rm.list_resources():
                inst = rm.open_resource(resource)
                if '34401' in inst.query("*IDN?"):
                    self._inst = inst
                    break
        if not self._inst:
            raise visa.errors.VisaIOError
        assert isinstance(self._inst, visa.resources.GPIBInstrument)
        self._query = self._inst.query
        self._write = self._inst.write
I also have a SCPI instrument parent class that I was trying to fill with all of the star commands (*IDN?, *CLS, etc.) I wasn't happy with the previous results and was intimidated by some of the star commands' mounting complexity (some of which seemed really instrument-specific), and abandoned it. Here's the first bit
from abc import ABCMeta, abstractmethod
class SCPI_Instrument(object, metaclass=ABCMeta):
    @abstractmethod
    def query(self, message, delay):
        pass
    @abstractmethod
    def write(self, message, delay):
        pass
    # *CLS - Clear Status
    def cls(self, termination = None, encoding = None):
        """Clears the instrument status byte by emptying the error queue and clearing all event registers.
        Also cancels any preceding *OPC command or query."""
        return self.write("*CLS", termination, encoding)
In the end, the documentation I was able to squeeze into it wasn't as helpful as I wanted. The autocomplete wasn't able to prompt for arguments by name, and having every single command suggested was less helpful than I expected. Trying to keep the import paths straight was difficult enough for me, much less someone who might end up using my code.
Ultimately, coding has never been part of my job description, and I was moved to a position where experimenting with instrument control is even less possible.

Python multiprocessing hangs on heavy multithreading situation

I'm struggling with a very weird Python behavior I cannot get rid of.
I created an external Python lib which I import from inside my main Python script. This library receives a huge (up to 50MB) JSON text document divided into sections. I need to parse each of those sections and extract data with regex.
To speed up this procedure, and knowing the limits of Python not scaling on multicore CPUs, I decided to use the multiprocessing library to create as many processes as there are cores available on the physical CPU.
The library splits the main JSON into sections and initializes the different multiprocessing.Process instances, passing each process a specific text section.
I do it this way:
p_one = multiprocessing.Process(
    name=name,
    target=functionOne,
    args=(buffer1, id1)
)
p_one.start()
p_two = multiprocessing.Process(
    name=name,
    target=functionTwo,
    args=(buffer2, id2)
)
p_two.start()
p_three = multiprocessing.Process(
    name=name,
    target=functionThree,
    args=(buffer3, id3)
)
p_three.start()
p_four = multiprocessing.Process(
    name=name,
    target=functionFour,
    args=(buffer4, id4)
)
p_four.start()
p_one.join()
p_two.join()
p_three.join()
p_four.join()
This usually works; however, from time to time one of the joins hangs, and it hangs the whole main lib, preventing my main script from continuing.
The child processes are not crashing, though: they use Google re2 as the regex library and they finish their parsing routine.
As previously said, this doesn't happen every time; it seems random. If I kill the whole process and restart it with the same buffer of strings, it works perfectly, so the regex rules are not wrong, nor are they hanging anything.
I tried the multiprocessing map function, but it ended up creating zombie processes, so I switched to multiprocessing.Process.
I also checked with Pyrasite, and the blocked thread hangs on the join function.
"/usr/local/lib/python2.7/dist-packages/xxxxx/dispatcher.py", line 401, in p_domain
p_two.join()
File "/usr/lib/python2.7/multiprocessing/process.py", line 145, in join
res = self._popen.wait(timeout)
File "/usr/lib/python2.7/multiprocessing/forking.py", line 154, in wait
return self.poll(0)
File "/usr/lib/python2.7/multiprocessing/forking.py", line 135, in poll
pid, sts = os.waitpid(self.pid, flag)
Please do you have any hint/suggestion/something that could help me in understanding why this is happening and how to get this fixed?
Many thanks!
* Addition *
This is an example of a piece of code called inside the multiprocessing.Process() child:
def check_domains(buf, id):
    warnings = []
    related_list = []
    strings, _ = parse_json_string(buf)
    for string in strings:
        _current_result = _check_single_domain(string)
        # If no result, add a warning message
        if _current_result is False:
            warnings.append("String timed out in DOMAIN section.")
        else:
            if _current_result is not None:
                related_list.append(_current_result)
    related_list = list(set(related_list))
    [...cut...]
def _check_single_domain(string):
    global DOMAIN_RESULT
    # Check string length
    if len(string) > MAX_STRING_LENGTH_FOR_DOMAIN:
        return None
    # Check unacceptable characters inside string
    for unacceptable in UNACCEPTABLE_DOMAIN_CHARACTERS:
        if unacceptable in string:
            return None
    # Check that string contains all required characters
    for required in REQUIRED_DOMAIN_CHARACTERS:
        if required not in string:
            return None
    # Try to match string against Domain regex using Thread with timeout
    thread = threading.Thread(name='Thread-DOMAIN', target=_perform_regex_against_string, args=(string, ))
    thread.setDaemon(True)
    thread.start()
    thread.join(TIMEOUT_REGEX_FOR_DOMAIN_IN_SECONDS)
    # If a timeout occurred, return False, meaning no result was obtained
    if thread.isAlive():
        return False
    if DOMAIN_RESULT is None:
        return None
    # Domain cannot start or end with a dot character
    if DOMAIN_RESULT.endswith(".") or DOMAIN_RESULT.startswith("."):
        return None
    return DOMAIN_RESULT
def _perform_regex_against_string(string):
    global DOMAIN_RESULT
    # Set result to default value
    DOMAIN_RESULT = None
    # Regex for Domains
    matched = re.search(REGEX, string, re.IGNORECASE)
    if matched:
        DOMAIN_RESULT = ''.join(matched.groups())

When working with a named pipe is there a way to do something like readlines()

Overall Goal: I am trying to read some progress data from a python exe to update the progress of the exe in another application
I have a python exe that is going to do some stuff, I want to be able to communicate the progress to another program. Based on several other Q&A here I have been able to have my running application send progress data to a named pipe using the following code
import win32pipe
import win32file
import glob
test_files = glob.glob('J:\\someDirectory\\*.htm')
# test_files has two items a.htm and b.htm
p = win32pipe.CreateNamedPipe(r'\\.\pipe\wfsr_pipe',
    win32pipe.PIPE_ACCESS_DUPLEX,
    win32pipe.PIPE_TYPE_MESSAGE | win32pipe.PIPE_WAIT,
    1,65536,65536,300,None)
# the following line is the server-side function for accepting a connection
# see the following SO question and answer
""" http://stackoverflow.com/questions/1749001/named-pipes-between-c-sharp-and-python
"""
win32pipe.ConnectNamedPipe(p, None)
for each in test_files:
    win32file.WriteFile(p,each + '\n')
#send final message
win32file.WriteFile(p,'Process Complete')
# close the connection
p.close()
In short the example code writes the path of the each file that was globbed to the NamedPipe - this is useful and can be easily extended to more logging type events. However, the problem is trying to figure out how to read the content of the named pipe without knowing the size of each possible message. For example the first file could be named J:\someDirectory\a.htm, but the second could have 300 characters in the name.
So far the code I am using to read the contents of the pipe requires that I specify a buffer size
First establish the connection
file_handle = win32file.CreateFile("\\\\.\\pipe\\wfsr_pipe",
win32file.GENERIC_READ | win32file.GENERIC_WRITE,
0, None,
win32file.OPEN_EXISTING,
0, None)
and then I have been playing around with reading from the file
data = win32file.ReadFile(file_handle,128)
This generally works but I really want to read until I hit a newline character, do something with the content between when I started reading and the newline character and then repeat the process until I get to a line that has Process Complete in the line
I have been struggling with how to read only until I find a newline character (\n). I basically want to read the file by lines and based on the content of the line do something (either display the line or shift the application focus).
Based on the suggestion provided by @meuh, I am updating this because I think there is a dearth of examples and guidance on how to use pipes.
My server code
import win32pipe
import win32file
import glob
import os
p = win32pipe.CreateNamedPipe(r'\\.\pipe\wfsr_pipe',
    win32pipe.PIPE_ACCESS_DUPLEX,
    win32pipe.PIPE_TYPE_MESSAGE | win32pipe.PIPE_WAIT,
    1,65536,65536,300,None)
# the following line is the server-side function for accepting a connection
# see the following SO question and answer
""" http://stackoverflow.com/questions/1749001/named-pipes-between-c-sharp-and-python
"""
win32pipe.ConnectNamedPipe(p, None)
for file_id in glob.glob('J:\\level1\\level2\\level3\\*'):
    for filer_id in glob.glob(file_id + os.sep + '*'):
        win32file.WriteFile(p,filer_id)
#send final message
win32file.WriteFile(p,'Process Complete')
# close the connection
p.close() #still not sure if this should be here, I need more testing
# I think the client can close p
The Client code
import win32pipe
import win32file
file_handle = win32file.CreateFile("\\\\.\\pipe\\wfsr_pipe",
    win32file.GENERIC_READ |
    win32file.GENERIC_WRITE,
    0, None,win32file.OPEN_EXISTING,0, None)
# this is the key, setting readmode to MESSAGE
win32pipe.SetNamedPipeHandleState(file_handle,
    win32pipe.PIPE_READMODE_MESSAGE, None, None)
# for testing purposes I am just going to write the messages to a file
out_ref = open('e:\\testpipe.txt','w')
dstring = '' # need some way to know that the messages are complete
while dstring != 'Process Complete':
    # setting the blocksize at 4096 to make sure it can handle any message I
    # might anticipate
    data = win32file.ReadFile(file_handle,4096)
    # data is a tuple, the first position seems to always be 0 but need to find
    # the docs to help understand what determines the value, the second is the
    # message
    dstring = data[1]
    out_ref.write(dstring + '\n')
out_ref.close() # got here so close my testfile
file_handle.close() # close the file_handle

I don't have Windows, but looking through the API it seems you should convert your client to message mode by adding this call after CreateFile():
win32pipe.SetNamedPipeHandleState(file_handle,
    win32pipe.PIPE_READMODE_MESSAGE, None, None)
Then each sufficiently long read will return a single message, i.e. what the other side wrote in a single write. You already set PIPE_TYPE_MESSAGE when you created the pipe.
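If message mode is not an option, the "read until a newline" behaviour the question asks for can also be had by buffering chunks and splitting on newlines. This is a platform-neutral sketch under assumptions: iter_lines and read_chunk are hypothetical names, and read_chunk is expected to return an empty bytes object at end of stream. With pywin32, I believe ReadFile raises an error on a broken pipe rather than returning empty data, so a real client would have to translate that into an empty chunk.

```python
def iter_lines(read_chunk):
    """Yield lines (without the newline) from a callable that returns byte chunks.

    read_chunk is a hypothetical zero-argument callable; it should return b""
    once the writer has closed the pipe.
    """
    buffer = b""
    while True:
        chunk = read_chunk()
        if not chunk:
            break
        buffer += chunk
        # a single chunk may contain several lines, or only part of one
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            yield line
    if buffer:
        yield buffer  # trailing data that was not newline-terminated
```

The reader can then loop over iter_lines(...) and stop when it sees the 'Process Complete' line, regardless of how the writer's messages were split across reads.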

You could simply use an implementation of io.IOBase that wraps the named pipe.
import io
import win32file
class PipeIO(io.RawIOBase):
    def __init__(self, handle):
        self.handle = handle
    def read(self, n):
        if (n == 0): return ""
        elif n == -1: return self.readall()
        # ReadFile returns a (result_code, data) tuple; keep only the data
        data = win32file.ReadFile(self.handle,n)[1]
        return data
    def readinto(self, b):
        data = self.read(len(b))
        for i in range(len(data)):
            b[i] = data[i]
        return len(data)
    def readall(self):
        data = ""
        while True:
            chunk = win32file.ReadFile(self.handle,10240)[1]
            if (len(chunk) == 0): return data
            data += chunk
BEWARE: untested, but it should work after fixing any remaining typos.
You could then do:
with PipeIO(file_handle) as fd:
    for line in fd:
        # process a line
        pass

You could use the msvcrt module and open to turn the pipe into a file object.
Sending code
import win32pipe
import os
import msvcrt
from io import open
pipe = win32pipe.CreateNamedPipe(r'\\.\pipe\wfsr_pipe',
    win32pipe.PIPE_ACCESS_OUTBOUND,
    win32pipe.PIPE_TYPE_MESSAGE | win32pipe.PIPE_WAIT,
    1,65536,65536,300,None)
# wait for another process to connect
win32pipe.ConnectNamedPipe(pipe, None)
# get a file descriptor to write to
write_fd = msvcrt.open_osfhandle(pipe, os.O_WRONLY)
with open(write_fd, "w") as writer:
    # now we have a file object that we can write to in a standard way
    for i in range(0, 10):
        # create "a\n" in the first iteration, "bb\n" in the second and so on
        text = chr(ord("a") + i) * (i + 1) + "\n"
        writer.write(text)
Receiving code
import win32file
import os
import msvcrt
from io import open
handle = win32file.CreateFile(r"\\.\pipe\wfsr_pipe",
    win32file.GENERIC_READ,
    0, None,
    win32file.OPEN_EXISTING,
    0, None)
read_fd = msvcrt.open_osfhandle(handle, os.O_RDONLY)
with open(read_fd, "r") as reader:
    # now we have a file object with the readlines and other file api methods
    lines = reader.readlines()
print(lines)
Some notes.
I've only tested this with python 3.4, but I believe you may be using python 2.x.
Python seems to get weird if you try to close both the file object and the pipe..., so I've only used the file object (by using the with block)
I've only created the file objects to read on one end and write on the other. You can of course make the file objects duplex by:
Creating the file descriptors (read_fd and write_fd) with the os.O_RDWR flag
Creating the file objects in "r+" mode rather than "r" or "w"
Going back to creating the pipe with the win32pipe.PIPE_ACCESS_DUPLEX flag
Going back to creating the file handle object with the win32file.GENERIC_READ | win32file.GENERIC_WRITE flags.

How to shutdown an httplib2 request when it is too long

I have a pretty annoying issue at the moment. When I make an httplib2 request for a page that is far too large, I would like to be able to stop it cleanly.
For example :
from httplib2 import Http
url = 'http://media.blubrry.com/podacademy/p/content.blubrry.com/podacademy/Neuroscience_and_Society_1.mp3'
h = Http(timeout=5)
h.request(url, 'GET')
In this example, the url is a podcast and it will keep being downloaded forever. My main process will hang indefinitely in this situation.
I tried running the request in a separate thread with the code below and deleting the thread object outright.
def http_worker(url, q):
    h = Http()
    print 'Http worker getting %s' % url
    q.put(h.request(url, 'GET'))
def process(url):
    q = Queue.Queue()
    t = Thread(target=http_worker, args=(url, q))
    t.start()
    tid = t.ident
    t.join(3)
    if t.isAlive():
        try:
            del t
            print 'deleting t'
        except: print 'error deleting t'
    else: print q.get()
    check_thread(tid)
process(url)
Unfortunately, the thread is still active and will continue to consume cpu / memory.
def check_thread(tid):
    import sys
    print 'Thread id %s is still active ? %s' % (tid, tid in sys._current_frames().keys() )
Thank you.

OK, I found a hack to deal with this issue.
The best solution so far is to set a maximum amount of data to read and to stop reading from the socket past that point. The data is read by the _safe_read method of the httplib module. To override this method, I used this lib: http://blog.rabidgeek.com/?tag=wraptools
And voilà:
import httplib
from httplib import HTTPResponse, IncompleteRead, MAXAMOUNT
from wraptools import wraps
@wraps(httplib.HTTPResponse._safe_read)
def _safe_read(original_method, self, amt):
    """Read the number of bytes requested, compensating for partial reads.
    Normally, we have a blocking socket, but a read() can be interrupted
    by a signal (resulting in a partial read).
    Note that we cannot distinguish between EOF and an interrupt when zero
    bytes have been read. IncompleteRead() will be raised in this
    situation.
    This function should be used when <amt> bytes "should" be present for
    reading. If the bytes are truly not available (due to EOF), then the
    IncompleteRead exception can be used to detect the problem.
    """
    # NOTE(gps): As of svn r74426 socket._fileobject.read(x) will never
    # return less than x bytes unless EOF is encountered. It now handles
    # signal interruptions (socket.error EINTR) internally. This code
    # never caught that exception anyways. It seems largely pointless.
    # self.fp.read(amt) will work fine.
    s = []
    total = 0
    MAX_FILE_SIZE = 3*10**6
    # stop once the requested amount or the size cap has been reached
    while amt > 0 and total < MAX_FILE_SIZE:
        chunk = self.fp.read(min(amt, httplib.MAXAMOUNT))
        if not chunk:
            raise IncompleteRead(''.join(s), amt)
        total = total + len(chunk)
        s.append(chunk)
        amt -= len(chunk)
    return ''.join(s)
In this case, MAX_FILE_SIZE is set to 3 MB (3*10**6 bytes).
Hopefully, this will help others.
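The idea of capping how much is read can be shown on any file-like object, independent of httplib internals. This is a minimal sketch under assumptions: read_capped is a hypothetical helper, and unlike the patch above it never reads past the cap, because each request is trimmed to the bytes still allowed.

```python
import io

def read_capped(fp, max_bytes, chunk_size=64 * 1024):
    """Read from fp until EOF or until max_bytes have been collected."""
    chunks = []
    total = 0
    while total < max_bytes:
        # never ask for more than the remaining allowance
        chunk = fp.read(min(chunk_size, max_bytes - total))
        if not chunk:
            break  # end of stream before the cap was reached
        chunks.append(chunk)
        total += len(chunk)
    return b"".join(chunks)
```

For example, read_capped(io.BytesIO(data), 3 * 10**6) would return at most the first three million bytes of data and leave the rest unread.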
