Named pipe in Python dies when multiprocessing

First question, so please be gentle.
I am using Python.
When I create a named pipe to a C++ Windows program with

PIPE = open(r'\\.\pipe\NamedPipe', 'rb+', 0)

as a global, I can read from and write to the pipe.
import struct

def pipe_writer():
    PIPE.write(some_stuff)

def pipe_reader():
    # the format string and byte count are placeholders
    data = struct.unpack("byte-type", PIPE.read(number_of_bytes))

pipe_writer()
pipe_reader()
This works fine for collecting the complete data from the pipe and then processing it with several functions, one function after the other.
Unfortunately, I have to process the data bit by bit as I pull it from the pipe, with several functions in a serialized manner.
I thought that queueing the data would do the job, so I use the multiprocessing module.
When I try to multiprocess, I am able to create the pipe and send data once, right after opening it under:
if __name__ == '__main__':
    PIPE = open(r'\\.\pipe\NamedPipe', 'rb+', 0)
    PIPE.write(some_stuff)
When I then try to .start() the functions as processes and read from the pipe, I get an error that the pipe doesn't exist or is open in the wrong mode. That can't really be right, as it works just fine when reading from and writing to it without wrapping the functions in Process(), and I can write to it, even if only once.
Any suggestions? I also think I need to use multiprocessing, as threading probably wouldn't work because of the GIL slowing things down.

If you're in control of the C++ source code too, you can save yourself a lot of code and hassle by moving to ZeroMQ or Nanomsg instead of the pipe, and Google Protocol Buffers instead of interpreting a byte stream yourself.
ZeroMQ and Nanomsg are like networks/pipes/IPC on steroids, and are much easier to use than raw pipes, sockets, etc. You get less source code and more functionality: win-win.
Google Protocol Buffers let you define data structures (messages) in a language-neutral way and then auto-generate source code in C++, Python, Java, or whatever. The generated code defines structs, classes, etc. that represent the messages and also converts them to a standard binary format. That binary data is what you'd send via ZeroMQ. Again, less source code for you to write, more functionality.
This is ideal for getting C++ classes into Python and vice versa.
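
To give a feel for how little code this takes, here is a minimal sketch of the Python side using pyzmq, assuming the C++ program binds a REP socket on tcp://127.0.0.1:5555 (the endpoint and payload are made up for illustration):

import zmq

context = zmq.Context()

# REQ/REP gives simple request-reply semantics; the C++ program
# would bind a matching REP socket on the same endpoint.
socket = context.socket(zmq.REQ)
socket.connect("tcp://127.0.0.1:5555")   # hypothetical endpoint

# In a real setup the payload would be a serialized
# Protocol Buffers message rather than raw bytes.
socket.send(b"request-bytes")
reply = socket.recv()                    # blocks until the reply arrives
print("got %d bytes back" % len(reply))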

A nanomsg Python wrapper is also available on GitHub at Nanomsg Python, with examples at Examples. I guess this wrapper will serve your purpose, and it's generally better to use it than raw pipes. It supports the inproc (within-process), IPC (between processes), and TCP transports.
Moreover, it is cross-platform, and its core implementation is in C, so communication between a Python process and a C process is also possible.
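
As a rough illustration, one endpoint of a PAIR connection over the IPC transport might look like the sketch below (untested, assuming the nanomsg-python wrapper's Socket API; the address is made up):

from nanomsg import Socket, PAIR

# One side binds to an IPC address...
server = Socket(PAIR)
server.bind('ipc:///tmp/demo.ipc')    # hypothetical address

# ...and in another process (Python or C) the peer would do:
#   client = Socket(PAIR)
#   client.connect('ipc:///tmp/demo.ipc')
#   client.send(b'hello')

msg = server.recv()    # blocks until the peer sends
server.close()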

Related

Do multiprocessing pipes fill up, and if so what behavior do they cause in a program?

I am working on a threaded Python program and am using pipes, but found that they freeze at a certain point (what I would consider a relatively small amount of data). I have a test case below. I've tried digging into documentation but have been unable to find anything.
import multiprocessing

def test():
    out_, in_ = multiprocessing.Pipe()
    for i in range(10**6):
        print(i)
        in_.send(i)

test()
When I run this code, it prints to 278 then stops, which seems to be a small amount of data. Is this due to it running out of memory or something else? Are there any workarounds or parameters I could use to increase the size?
Yes, pipes have a limited amount of storage; how much depends on the operating system. If a process writes to the pipe faster than another process reads from it, the pipe will eventually fill up. When that happens, the next attempt to write will block until the reader reads enough to make space for it.
A pipe can also be put into non-blocking mode. In that case, trying to write to a full pipe returns an error indication, and the writing code can decide how to deal with it. The Python multiprocessing module doesn't appear to expose a way to make a Connection non-blocking. The usual workaround (see Python multiprocess non-blocking intercommunication using Pipes) is for the reading end to call poll(), which reports whether data is waiting to be read, so the reader never blocks; there is no equivalent check for whether the pipe is writable.
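
A minimal sketch of that pattern, where a child process writes and the parent drains the pipe with poll() so it never blocks on recv():

import multiprocessing

def writer(conn):
    # send a burst of messages, then a sentinel to signal completion
    for i in range(1000):
        conn.send(i)
    conn.send(None)
    conn.close()

if __name__ == '__main__':
    recv_end, send_end = multiprocessing.Pipe(duplex=False)
    p = multiprocessing.Process(target=writer, args=(send_end,))
    p.start()
    done = False
    while not done:
        # poll() returns immediately; recv() is only called when data
        # is known to be waiting, so the parent never blocks here
        while recv_end.poll():
            item = recv_end.recv()
            if item is None:
                done = True
                break
        # ... do other work here while the pipe is empty ...
    p.join()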

Inter-process communication using physical text files

I am reading financial data from my broker in real time through a websocket API. The client is written in Python. I have another C++ program that reads that data, but the way I am communicating with the Python script is through a physical text file.
My questions are:
1) Does constantly rewriting the text file (opening, reading, and closing it every time) affect performance? If so, what's a better way to do it? Performance is crucial for my application.
2) Would using named pipes be a better option, or is that pretty much the same as writing to and reading from a text file?
Modern OSes support many different IPC mechanisms: pipes, named pipes, sockets, memory-mapped files, and so on. The choice of one solution over another depends heavily on your application, but broadly speaking, all of them should be "better" than using a plain old file.
Since IPC objects are managed by the OS, they are not tied to the language used to write the various processes. Some IPC mechanisms have file semantics (pipes, named pipes); others require dedicated system primitives (mmap). But C++ and Python (and many other languages) support the required system calls. In fact, IPC is great for helping software written in different languages talk to each other.
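
As one concrete example, a POSIX named pipe (FIFO) keeps the file-like semantics of the current approach while letting the kernel handle buffering. A minimal sketch of the Python writer side, with a made-up path and record format (the C++ reader would simply open the same path):

import os

FIFO_PATH = '/tmp/ticks.fifo'   # hypothetical path

# create the FIFO once; an existing one is fine to reuse
try:
    os.mkfifo(FIFO_PATH)
except FileExistsError:
    pass

# open() blocks until the reader opens its end of the FIFO
with open(FIFO_PATH, 'wb', buffering=0) as fifo:
    fifo.write(b'EURUSD,1.0842\n')   # one record per line, for example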

Serial communication in Linux user space (non blocking)

I'm trying to get an Arduino board to communicate with a BeagleBone (BB) White running Ubuntu, using UART. I have read that the BB UART driver is already interrupt-driven.
I want to store all incoming data in a sort of buffer which I can read when required, similar to the way it's done on microcontrollers. But I'm trying to avoid kernel programming, so I won't be able to use the driver's data structures; I'm looking for a complete user-space solution.
I'm planning to use two Python processes: one to write all incoming data (to a shared list) and the other to read it as required, so that the read is non-blocking.
I have two questions:
Is this the right approach? If yes, please suggest a simple inter-process communication method that will suffice.
What is the right way to implement this?
Note: I'm using the PyBBIO library, which reads and writes directly to the /dev/mem special file.
You might want to use pyserial, which uses the kernel interfaces (I don't know what PyBBIO does). It provides automatic input buffering, so you don't need an extra process. If you do want more processes, use multiprocessing. A simpler alternative is threading, which saves you the communication part. For multiprocessing with network support, use IPython's cluster.
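
A minimal sketch of a non-blocking read loop with pyserial, assuming the BeagleBone UART shows up as /dev/ttyO1 (the device name and baud rate are guesses):

import serial

# timeout=0 makes read() non-blocking: it returns whatever
# bytes the kernel has buffered, possibly none at all
ser = serial.Serial('/dev/ttyO1', baudrate=9600, timeout=0)

buffer = bytearray()
while True:
    chunk = ser.read(256)    # up to 256 bytes, returns immediately
    if chunk:
        buffer.extend(chunk)
    # ... consume from `buffer` as required ...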

Python os.pipe vs multiprocessing.Pipe

Recently I've been studying parallel programming tools in Python, and here are two major differences between os.pipe and multiprocessing.Pipe (apart from the occasions on which they are used):
os.pipe is unidirectional, multiprocessing.Pipe is bidirectional;
when putting things into / receiving things from the pipe, os.pipe uses encode/decode, while multiprocessing.Pipe uses pickle/unpickle.
I want to know if my understanding is correct, and whether there are other differences. Thank you.
I believe everything you've stated is correct.
On Linux, os.pipe is just a Python interface for accessing traditional POSIX pipes. On Windows, it's implemented using CreatePipe. When you call it, you get two ordinary file descriptors back. It's unidirectional, and you just write bytes to it on one end that get buffered by the kernel until someone reads from the other side. It's fairly low-level, at least by Python standards.
multiprocessing.Pipe returns a pair of much higher-level multiprocessing.Connection objects. On Linux, these are actually built on top of POSIX sockets rather than POSIX pipes; on Windows, they're built using the CreateNamedPipe API. As you noted, multiprocessing.Connection objects can send and receive any picklable object, and automatically handle the pickling/unpickling, rather than just dealing in bytes. They can be either bidirectional (the default) or unidirectional (duplex=False).
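
A small sketch contrasting the two, just to make the byte-level vs. object-level difference concrete:

import os
import multiprocessing

# os.pipe: two raw file descriptors, bytes only
r_fd, w_fd = os.pipe()
os.write(w_fd, b'raw bytes')
print(os.read(r_fd, 1024))       # b'raw bytes'
os.close(r_fd)
os.close(w_fd)

# multiprocessing.Pipe: two Connection objects, any picklable object
a, b = multiprocessing.Pipe()    # bidirectional by default
a.send({'price': 1.08, 'ids': [1, 2, 3]})
print(b.recv())                  # {'price': 1.08, 'ids': [1, 2, 3]}
a.close()
b.close()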

IPC between python app and injected DLL

Hello Stack Overflow: sometime reader, first-time poster.
Background:
Windows box running XP SP3, soon to be upgraded to Windows 7 (MSDNAA <3)
I have an injected DLL which gets cycles by hooking a function that is called thousands of times a second.
I would like to communicate with / control this DLL via a Python app. Basically, the DLL does the work; the Python app supplies the brains/decision making.
My game plan is to have a counter and an if statement in the DLL. Each time the hooked function is called, counter++ and then jump back to the original function, until something like if ( counter == 250 ) { dostuff(); }. My thought is that this will allow the target app to run mostly unimpeded, but will still let me do interesting things.
Problem:
I'm at an utter loss as to which IPC method I should use for the communication. We have sockets, shared memory, pipes, file mapping(?), RPC, and other (seemingly) esoteric stuff like writing to the clipboard.
I've NEVER implemented any kind of IPC beyond toy examples.
I'm fairly sure I need something that:
Can handle talking back and forth between python and a DLL
Doesn't block/wait
Can check for waiting data, and continue if there isn't any
If locks are involved, can continue instead of waiting
Is cheap to read from and write to
Help? Thank you for your time; I hope I've provided enough general information and not broken any accepted conventions.
I would like to add that the related questions box is very cool, and I did peruse it before posting.
Try sockets. Your demands essentially amount to a requirement for asynchronous operation; Python has the asyncore module for asynchronous IO on sockets. At the same time, it doesn't look like Python's stdlib can handle other IPC mechanisms asynchronously, so I wouldn't recommend them.
If you don't care about real time, then you can use the file system for communication: a log file for the DLL's output, and a config file that is read every now and then to change the DLL's behavior.
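
A minimal sketch of the Python side, assuming the DLL connects to a local TCP port (the port number is made up); select with a zero timeout gives exactly the "check for waiting data, continue if there isn't any" behavior:

import select
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('127.0.0.1', 9750))   # hypothetical port
server.listen(1)
conn, _ = server.accept()          # the injected DLL connects here

while True:
    # zero timeout: select returns immediately, never blocks
    readable, _, _ = select.select([conn], [], [], 0)
    if readable:
        data = conn.recv(4096)
        if not data:
            break                  # DLL disconnected
        # ... act on `data`, send decisions back with conn.send() ...
    # ... carry on with other work in the meantime ...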
