I have two applications interacting over a TCP/IP connection; now I need them to be able to interact over a serial connection as well.
There are a few differences between socket IO and serial IO that make porting less trivial than I hoped for.
One of the differences is in the semantics of send/write timeouts and in what an application may assume about the amount of data successfully passed down the connection. Knowing this amount, the application also knows what leftover data it needs to transmit later, should it choose to do so.
Socket.send
A call like socket.send(string) may produce the following results:
The entire string has been accepted by the TCP/IP stack, and the
length of the string is returned.
A part of the string has been accepted by the TCP/IP stack, and the
length of that part is returned. The application may transmit the
rest of the string later.
A socket.timeout exception is raised if the socket is configured to
use timeouts and the sender overwhelms the connection with data.
This means (if I understand it correctly) that no bytes of the
string have been accepted by the TCP/IP stack and hence the
application may try to send the entire string later.
A socket.error exception is raised because of some issues with the
connection.
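To make the four outcomes above concrete, here is a minimal sketch of a send loop that keeps track of how much the stack has accepted. The helper name send_tracking is hypothetical (it is not part of the socket module), and it assumes a connected socket that may or may not have a timeout configured:

```python
import socket

def send_tracking(sock, data):
    """Try to send all of data; return the number of bytes the stack accepted.

    On socket.timeout the count accepted so far is returned, so the caller
    can retry with data[count:] later. Other socket errors propagate.
    """
    sent = 0
    while sent < len(data):
        try:
            # send() may accept only part of what it is given
            sent += sock.send(data[sent:])
        except socket.timeout:
            # nothing from this call was accepted; report the earlier total
            break
    return sent
```

If the returned count is smaller than len(data), the leftover to transmit later is data[count:].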
PySerial.Serial.write
The PySerial API documentation says the following about Serial.write(string):
write(data)
Parameters:
data – Data to send.
Returns:
Number of bytes written.
Raises
SerialTimeoutException:
In case a write timeout is configured for the port and the time is exceeded.
Changed in version 2.5: Write returned None in previous versions.
This spec leaves a few questions open for me:
In which circumstances may "write(data)" return fewer bytes written
than the length of the data? Is it only possible in the non-blocking
mode (writeTimeout=0)?
If I use a positive writeTimeout and the SerialTimeoutException is
raised, how do I know how many bytes went into the connection?
I also observe some behaviors of serial.write that I did not expect.
The test tries sending a long string over a slow connection. The sending port uses 9600,8,N,1 and no flow control. The receiving port is open too but no attempts to read data from it are being made.
If the writeTimeout is positive but not large enough the sender expectedly
gets the SerialTimeoutException.
If the writeTimeout is set large enough the sender expectedly gets all data written
successfully (the receiver does not care to read, neither do we).
If the writeTimeout is set to None, the sender unexpectedly gets the SerialTimeoutException
instead of blocking until all data goes down the connection. Am I missing something?
I do not know if that behavior is typical.
In case that matters, I experiment with PySerial on Windows 7 64-bit using two USB-to-COM adapters connected via a null-modem cable; that setup seems to be operational as two instances of Tera Term can talk to each other over it.
It would be helpful to know if people handle serial write timeouts in any way other than aborting the connection and notifying the user of the problem.
Since I currently do not know the amount of data written before the timeout has occurred, I am thinking of a workaround using non-blocking writes and maintaining the socket-like timeout semantics myself above that level. I do not expect this to be a terrifically efficient solution (:-)), but luckily my applications exchange relatively infrequent and short messages so the performance should be within the acceptable range.
[EDITED]
A closer look at non-blocking serial writes
I wrote a simple program to see if I understand how the non-blocking write works:
import serial
p1 = serial.Serial("COM11") # My USB-to-COM adapters appear at these high port numbers
p2 = serial.Serial("COM12")
message = "Hello! " * 10
print "%d bytes in the whole message: %r" % (len(message), message)
p1.writeTimeout = 0 # enabling non-blocking mode
bytes_written = p1.write(message)
print "Written %d bytes of the message: %r" % (bytes_written, message[:bytes_written])
print "Receiving back %d bytes of the message" % len(message)
message_read_back = p2.read(len(message))
print "Received back %d bytes of the message: %r" % (len(message_read_back), message_read_back)
p1.close()
p2.close()
The output I get is this:
70 bytes in the whole message: 'Hello! Hello! Hello! Hello! Hello! Hello! Hello! Hello! Hello! Hello! '
Written 0 bytes of the message: ''
Receiving back 70 bytes of the message
Received back 70 bytes of the message: 'Hello! Hello! Hello! Hello! Hello! Hello! Hello! Hello! Hello! Hello! '
I am very confused: the sender thinks no data was sent yet the receiver got it all. I must be missing something very fundamental here...
Any comments / suggestions / questions are very welcome!
Since it isn't documented, let's look at the source code. I only looked at the POSIX and Win32 implementations, but it's pretty obvious that on at least those two platforms:
There are no circumstances when write(data) may return fewer bytes written than the length of the data, timeout or otherwise; it always either returns the full len(data), or raises an exception.
If you use a positive writeTimeout and the SerialTimeoutException is raised, there is no way at all to tell how many bytes were sent.
In particular, on POSIX, the number of bytes sent so far is only stored in a local variable that's lost as soon as the exception is raised; on Windows, it just does a single overlapped WriteFile and raises an exception for anything but a successful "wrote everything".
I assume that you care about at least one of those two platforms. (And if not, you're probably not writing cross-platform code, and can look at the one platform you do care about.) So, there is no direct solution to your problem.
If the workaround you described is acceptable, or a different one (like writing exactly one byte at a time—which is probably even less efficient, but maybe simpler), do that.
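A rough sketch of the chunked variant of that workaround follows. Everything here is an assumption for illustration: chunked_write and PartialWrite are made-up names, and the caller passes in the exception class the port raises on a write timeout (for pySerial that would be serial.SerialTimeoutException). The count it reports is only a lower bound, since part of the chunk that timed out may still have gone out; with chunk_size=1 the uncertainty shrinks to a single byte.

```python
class PartialWrite(Exception):
    """Raised when a chunk times out; carries the bytes confirmed before it."""
    def __init__(self, confirmed):
        Exception.__init__(self, "timed out after %d confirmed bytes" % confirmed)
        self.confirmed = confirmed

def chunked_write(port, data, timeout_error, chunk_size=1):
    """Write data in chunk_size pieces via port.write().

    timeout_error is the exception class the port raises on a write timeout.
    Returns len(data) on success, raises PartialWrite otherwise.
    """
    confirmed = 0
    while confirmed < len(data):
        chunk = data[confirmed:confirmed + chunk_size]
        try:
            port.write(chunk)
        except timeout_error:
            # Only the earlier chunks are known to have been accepted.
            raise PartialWrite(confirmed)
        confirmed += len(chunk)
    return confirmed
```

On PartialWrite, the caller can resend data[e.confirmed:] or give up, which is roughly the socket-like behaviour the question asks for.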
Alternatively, you will have to edit the write implementations you care about (whether you do this by forking the package and editing your fork, monkeypatching Serial.write at runtime, or just writing a serial_write function and calling serial_write(port, data) instead of port.write(data) in your script) to provide the information you want.
That doesn't look too hard. For example, in the POSIX version, you just have to stash len(data)-t somewhere before either of the raise writeTimeoutError lines. You could stick it in an attribute of the Serial object, or pass it as an extra argument to the exception constructor. (Of course if you're trying to write a cross-platform program, and you don't know all of the platforms well enough to write the appropriate implementations, that isn't likely to be a good answer.)
And really, given that it's not that hard to implement what you want, you might want to add a feature request (ideally with a patch) on the pyserial tracker.
Related
First off, I've got the following code that... works. Apparently.
while not self.socket_connected:
    try:
        client_socket.connect((self.hostname, self.port))
        self.socket_connected = True
    except:
        sleep(0.5)
        pass
while self.socket_connected:
    message = client_socket.recv(4096)
    if(message == b''):
        client_socket.close()
        self.socket_connected = False
        break
    #...do stuff
I say "apparently" because I'm reading conflicting sources about how one ought to implement sockets in Python.
Firstly, you've got information as here and here that would have you believe an empty buffer is a disconnected socket. That must've been what I read first (the code above is a few months old at this point, and my first serious attempt at sockets in Python).
However, there's also this post that seems a little better informed. That is, if the buffer is empty, it just means you've read everything available for now. Kind of like how I understand TCP to work in the first place. And maybe I missed it, but is that even mentioned in the docs?
Anyway... what I realized about my code is that, every time the buffer is empty, I drop the client-side socket and reconnect to read new information. That's obviously not ideal, and I'd like to change it.
In C, if recv returns zero, the buffer is empty. If it returns <0, something's gone wrong and you can destroy the file descriptor and attempt to reestablish the connection. How is one supposed to do the same in Python?
EDIT: Just as a bit more context - I've got the first five bytes of the messages being received here encoded to the size of the overall message, so I'll be able to test for 'done-ness' internally, provided that I can distinguish between an empty buffer and a dropped socket.
EDIT 2: What I'm asking specifically is how to check Python sockets for both an empty buffer as well as a dropped connection. Both should be handled differently, of course, and I need to make sure I'm getting the full message by possibly doing multiple recv() calls.
By default socket.recv is a blocking call: it suspends the thread until there is something in the receive buffer, then returns whatever is available, up to the requested number of bytes.
When the peer closes the connection, recv does not wait on the empty buffer; it returns b'' immediately.
Non-blocking sockets that have no data when you call .recv raise an exception instead of returning b''.
So to answer what I believe the question is: by default, socket.recv returns b'' when the other side disconnects, not when the buffer is merely empty.
edit: To see if a socket's buffer is empty, you could disable blocking and then catch the exception raised by recv when no data is available.
Alternatively, use the select module to sort your sockets into three lists: ready to read, ready to write, and sockets with exceptional conditions.
https://docs.python.org/3/library/select.html#select.select
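As a small sketch of the select approach (the function name poll_recv and the three status strings are my own, not anything from the standard library), this distinguishes "nothing to read right now" from "peer closed the connection":

```python
import select
import socket

def poll_recv(sock, bufsize=4096, wait=0.0):
    """Return ('data', bytes), ('empty', b'') or ('closed', b'')."""
    readable, _, _ = select.select([sock], [], [], wait)
    if not readable:
        # nothing buffered within the wait period; connection still open
        return ("empty", b"")
    chunk = sock.recv(bufsize)
    if chunk == b"":
        # readable but zero bytes: the peer closed its end
        return ("closed", b"")
    return ("data", chunk)
```

Abrupt failures such as a connection reset still surface as exceptions from recv, so a caller would normally wrap this in a try/except for socket errors as well.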
I am using two libraries to talk to a serial port, and they write commands in different styles. I want to understand the difference because I want to use the second one, but with it the port becomes unresponsive after some time, and I wonder if it causes some kind of overloading. Here are the methods.
Method 1:
if self.port:
    self.port.flushOutput()
    self.port.flushInput()
    for c in cmd:
        self.port.write(c)
    self.port.write("\r\n")
Method 2:
if self.port:
    cmd += b"\r\n"
    self.port.flushInput()
    self.port.write(cmd)
    self.port.flush()
The major difference I notice first is that the first one splits the command into single characters and then sends them one by one. I wonder if this makes any difference. And as I said, the second code fails after some time (it is unclear whether these methods are the problem). I don't understand what the flushes do there. I want to understand the difference between these and know if the second one is prone to errors.
Note: Please note that self.port is serial.Serial object.
Any advice appreciated.
Well, from the pySerial documentation the function flushInput has been renamed to reset_input_buffer and flushOutput to reset_output_buffer:
reset_input_buffer()
Flush input buffer, discarding all it’s contents.
Changed in version 3.0: renamed from flushInput()
reset_output_buffer()
Clear output buffer, aborting the current output and discarding all that is in the buffer.
Changed in version 3.0: renamed from flushOutput()
The first method is less likely to fail because the output buffer is reset before attempting a write. This implies that the buffer is always empty before writing, hence the odds the write will fail are lower.
The problem is that both the methods are error prone:
There is no guarantee that all the data you are attempting to write will be written by the write() function, either with or without the for loop. This can happen if the output buffer is already full. But the write() function returns the number of bytes successfully written to the buffer. Hence you should loop until the number of written bytes is equal to the number of bytes you wanted to write:
import time

toWrite = b"the command\r\n"
written = 0
while written < len(toWrite):
    # write() may accept only part of the remaining data
    n = self.port.write(toWrite[written:]) or 0
    written += n
    if n == 0:
        # the buffer is full:
        # wait until some bytes are actually transmitted
        time.sleep(0.1)
Note that "writing to the buffer" doesn't mean that the data is instantly transmitted on the serial port; the buffer will be flushed to the serial port when the operating system thinks it is time to do so, or when you force it by calling the flush() function, which waits for all the data to be written to the port.
Note also that this approach will block the thread's execution until the write is successfully completed; this can take a while if the serial port is slow or you want to write a large amount of data.
If your program is ok with that you are fine; otherwise you can dedicate a different thread to serial port communication or adopt a non-blocking approach. In the former case you will have to handle communication between threads; in the latter you will have to manage your own buffer and delete only the successfully written bytes.
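For the non-blocking variant, the bookkeeping could look something like the sketch below. OutgoingBuffer and write_some are hypothetical names; write_some stands for any callable that takes bytes and returns how many it accepted (for example a non-blocking port.write). It assumes that return value is accurate, which is worth verifying for your pySerial version and platform.

```python
class OutgoingBuffer(object):
    """Holds unsent bytes; drops only what write_some() reports as written."""

    def __init__(self):
        self._pending = b""

    def queue(self, data):
        """Append data to the bytes waiting to be written."""
        self._pending += data

    def pump(self, write_some):
        """Call write_some(pending_bytes) once; return how many bytes remain."""
        if self._pending:
            n = write_some(self._pending) or 0   # treat None as 0
            self._pending = self._pending[n:]
        return len(self._pending)
```

The main loop would call pump() whenever it gets a chance and keep going as long as the returned value is non-zero.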
Finally if your program is really simple an approach like this should do the trick:
if self.port:
    cmd+=b"\r\n"
    for c in cmd:
        self.port.write(c)
        self.port.flush()
But it will be extremely inefficient.
I was wondering if there is a way I can tell python to wait until it gets a response from a server to continue running.
I am writing a turn based game. I make the first move and it sends the move to the server and then the server to the other computer. The problem comes here. As it is no longer my turn I want my game to wait until it gets a response from the server (wait until the other player makes a move). But my line:
data=self.sock.recv(1024)
hangs because (I think) it's not getting something immediately. So I want to know how I can make it wait for something to happen and then keep going.
Thanks in advance.
The socket programming howto is relevant to this question, specifically this part:
Now we come to the major stumbling block of sockets - send and recv operate on the
network buffers. They do not necessarily handle all the bytes you hand them (or expect
from them), because their major focus is handling the network buffers. In general, they
return when the associated network buffers have been filled (send) or emptied (recv).
They then tell you how many bytes they handled. It is your responsibility to call them
again until your message has been completely dealt with.
...
One complication to be aware of: if your conversational protocol allows multiple
messages to be sent back to back (without some kind of reply), and you pass recv an
arbitrary chunk size, you may end up reading the start of a following message. You’ll
need to put that aside and hold onto it, until it’s needed.
Prefixing the message with it’s length (say, as 5 numeric characters) gets more complex,
because (believe it or not), you may not get all 5 characters in one recv. In playing
around, you’ll get away with it; but in high network loads, your code will very quickly
break unless you use two recv loops - the first to determine the length, the second to
get the data part of the message. Nasty. This is also when you’ll discover that send
does not always manage to get rid of everything in one pass. And despite having read
this, you will eventually get bit by it!
The main takeaways from this are:
you'll need to establish either a FIXED message size, OR you'll need to send the size of the message at the beginning of the message
when calling socket.recv, pass the number of bytes you actually want (and I'm guessing you don't actually want 1024 bytes). Then use LOOPs because you are not guaranteed to get all you want in a single call.
That line, sock.recv(1024), blocks until at least some data has arrived (it then returns at most 1024 bytes, possibly fewer) or the OS detects a socket error. You need some way to know the message size -- this is why HTTP messages include the Content-Length.
You can set a timeout with socket.settimeout to abort reading entirely if the expected number of bytes doesn't arrive before a timeout.
You can also explore Python's non-blocking sockets using setblocking(0).
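Putting the loop and the timeout together, a minimal sketch might look like this. recv_exactly is a hypothetical helper name, and it assumes you already know how many bytes the next message has (a fixed move size, say, or a length read from a header):

```python
import socket

def recv_exactly(sock, count):
    """Loop on recv until exactly count bytes arrive.

    Raises EOFError if the peer closes first; socket.timeout propagates
    if a timeout was set with sock.settimeout().
    """
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)   # may return fewer bytes than asked
        if not chunk:
            raise EOFError("peer closed with %d bytes outstanding" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

With no timeout set this simply waits until the other player's move has fully arrived, which is what the game wants; with a timeout set the caller can catch socket.timeout and decide whether to keep waiting.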
How do I get the following code to break up large files into smaller parts and send those parts, instead of sending the whole file? It fails to send large files (Tested with an ubuntu iso around 600mb)
...some code
# file transfer
with open(sendFile, "rb") as f:
    while 1:
        fileData = f.read()
        if fileData == "": break
        # send file
        s.sendall(EncodeAES(cipher, fileData))
f.close()
...more code
I tried with f.read(1024), but that didn't work.
Finally, when splitting up the files, I would need to be able to put the parts together again.
I'm also encrypting the files using PyCrypto, if that has any impact on what I'm trying to do. I guess it would be smartest to encrypt the separate parts, instead of encrypting the whole file and then splitting that into parts.
Hope the above code is enough. If not, I'll update with more code.
I may be wrong, but I'm betting that your actual problem is not what you think it is, and it's the same reason your attempt to fix it by reading 1K at a time didn't help. Apologies if I'm wrong, and you already know this basic stuff.
You're trying to send your cipher text like this:
s.sendall(EncodeAES(cipher, fileData))
There is certainly no length information, no delimiter, etc. within this code. And you can't possibly be sending length data outside this function, because you don't know how long the ciphertext will be before getting to this code.
So, I'm guessing the other side is doing something like this:
data = s.recv(10*1024*1024)
with open(recvFile, "wb") as f:
    f.write(DecodeAES(cipher, data))
Since the receiver has no way of knowing where the encrypted file ends and the next encrypted file (or other message) begins, all it can do is try to receive "everything" and then decrypt it. But that could be half the file, or the file plus 6-1/2 other messages, or the leftover part of some previous message plus half the file, etc. TCP sockets are just streams of bytes, not sequences of separate messages. If you want to send messages, you have to build a protocol on top of TCP.
I'm guessing the reason you think it only fails with large files is that you're testing on localhost, or on a simple LAN. In that case, for smallish sends, there's a 99% chance that you will recv exactly as much as you sent. But once you get too big for one of the buffers along the way, it goes from working 99% of the time to 0% of the time, so you assume the problem is that you just can't send big files.
And the reason you think that breaking it into chunks of 1024 bytes gives you gibberish is that it means you're doing a whole bunch of messages in quick succession, making it much less likely that the send and recv calls will match up one-to-one. (Or this one may be even simpler—e.g., you didn't match the changes on the two sides, so you're not decrypting the same way you're encrypting.)
Whenever you're trying to send any kind of messages (files, commands, whatever) over the network, you need a message-based protocol. But TCP/IP is a byte-stream-based protocol. So, how do you handle that? You build a message protocol on top of the stream protocol.
The easiest way to do that is to take a protocol that's already been designed for your purpose, and that already has Python libraries for the client and either Python libraries or a stock daemon that you can just use as-is for the server. Some obvious examples for sending a file are FTP, TFTP, SCP, or HTTP. Or you can use a general-purpose protocol like netstring, JSON-RPC, or HTTP.
If you want to learn to design and implement protocols yourself, there are two basic approaches.
First, you can start with Twisted, monocle, Tulip, or some other framework that's designed to do all the tedious and hard-to-get-right stuff so you only have to write the part you care about: turning bytes into messages and messages into bytes.
Or you can go bottom-up, and build your protocol handler out of basic socket calls (or asyncore or something else similarly low-level). Here's a simple example:
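import struct

def send_message(sock, msg):
    length = len(msg)
    if length >= (1 << 32):
        raise ValueError('Sorry, {} is too big to fit in a 4GB message'.format(length))
    # 4-byte big-endian length header, then the payload
    sock.sendall(struct.pack('!I', length))
    sock.sendall(msg)

def recv_bytes(sock, length):
    buf = b''
    while len(buf) < length:
        # ask only for what is still missing
        received = sock.recv(length - len(buf))
        if not received:
            if not buf:
                # closed before any byte of this read arrived
                return buf
            raise RuntimeError('Socket seems to have closed in mid-message')
        buf += received
    return buf

def recv_message(sock):
    length_buf = recv_bytes(sock, 4)
    if not length_buf:
        # clean close between messages
        return None
    # unpack returns a 1-tuple
    length, = struct.unpack('!I', length_buf)
    msg_buf = recv_bytes(sock, length)
    if len(msg_buf) < length:
        raise RuntimeError('Socket closed after the length header')
    return msg_buf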
Of course in real life, you don't want to do tiny little 4-byte reads, which means you need to save up a buffer across multiple calls to recv_bytes. More importantly, you usually want to turn the flow of control around, with a Protocol or Decoder object or callback or coroutine. You feed it with bytes, and it feeds something else with messages. (And likewise for the sending side, but that's always simpler.) By abstracting the protocol away from the socket, you can replace it with a completely different transport—a test driver (almost essential for debugging protocol handlers), a tunneling protocol, a socket tied to a select-style reactor (to handle multiple connections at the same time), etc.
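To give an idea of the "feed it bytes, get messages out" shape, here is a minimal sketch of such a decoder for the same 4-byte length prefix used by send_message. The class name LengthPrefixedDecoder and its feed method are my own invention, not taken from Twisted or any other framework:

```python
import struct

class LengthPrefixedDecoder(object):
    """Feed it arbitrary byte chunks; it hands back the complete messages."""

    HEADER = struct.Struct("!I")   # 4-byte big-endian length, as in send_message

    def __init__(self):
        self._buf = b""

    def feed(self, data):
        """Add data to the internal buffer; return a list of complete messages."""
        self._buf += data
        messages = []
        while len(self._buf) >= self.HEADER.size:
            (length,) = self.HEADER.unpack(self._buf[:self.HEADER.size])
            end = self.HEADER.size + length
            if len(self._buf) < end:
                break                      # body not fully received yet
            messages.append(self._buf[self.HEADER.size:end])
            self._buf = self._buf[end:]    # keep any bytes of the next message
        return messages
```

Because the decoder never touches a socket, you can drive it from sock.recv(65536), from a reactor callback, or from a test that slices a byte string at awkward places.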
I'd like to write a simple command line proxy in Python to sit between a Telnet/SSH connection and a local serial interface. The application should simply bridge I/O between the two, but filter out certain unallowed strings (matched by regular expressions). (This for a router/switch lab in which the user is given remote serial access to the boxes.)
Basically, a client established a Telnet or SSH connection to the daemon. The daemon passes the client's input out (for example) /dev/ttyS0, and passes input from ttyS0 back out to the client. However, I want to be able to blacklist certain strings coming from the client. For instance, the command 'delete foo' should not be allowed.
I'm not sure how best to approach this. Communication must be asynchronous; I can't simply wait for a carriage return to allow the buffer to be fed out the serial interface. Matching regular expressions against the stream seems tricky too, as all of the following must be intercepted:
delete foo(enter)
del foo(enter)
el foo(ctrl+a)d(enter)
dl(left)e(right) foo(enter)
...and so forth. The only solid delimiter is the CR/LF.
I'm hoping someone can point me in the right direction. I've been looking through Python modules but so far haven't come up with anything.
Python is not my primary language, so I'll leave that part of the answer for others. I do a lot of security work, though, and I would urge a "white list" approach, not a "black list" approach. In other words, pick a set of safe commands and forbid all others. This is much, much easier than trying to think of all the malicious possibilities and guarding against all of them.
As all the examples you show finish with (enter), why is it that...:
Communication must be asynchronous; I
can't simply wait for a carriage
return to allow the buffer to be fed
out the serial interface
if you can collect incoming data until the "enter", and apply the "edit" requests (such as the ctrl-a, left, right in your examples) to the data you're collecting, then you're left with the "completed command about to be sent" in memory where it can be matched and rejected or sent on.
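Here is a rough sketch of that idea, combined with the whitelist suggestion from the other answer. Everything in it is an assumption for illustration: the class name LineFilter, the handful of control characters it understands (backspace/delete and Ctrl+A only; arrow keys arrive as escape sequences and would need more work), and the example patterns. It also assumes each line ends with a single CR or LF.

```python
import re

class LineFilter(object):
    """Buffers keystrokes, applies simple edits, vets the line on Enter."""

    BACKSPACE = ("\x08", "\x7f")
    LINE_START = "\x01"              # Ctrl+A: move cursor to start of line

    def __init__(self, allowed_patterns):
        # each pattern must match the whole line
        self._allowed = [re.compile(r"(?:%s)\Z" % p) for p in allowed_patterns]
        self._chars = []
        self._cursor = 0

    def feed(self, ch):
        """Process one character.

        Returns None while the line is incomplete, (True, line) for an
        allowed line and (False, line) for a rejected one.
        """
        if ch in ("\r", "\n"):
            line = "".join(self._chars)
            self._chars, self._cursor = [], 0
            ok = any(p.match(line) for p in self._allowed)
            return (ok, line)
        if ch in self.BACKSPACE:
            if self._cursor > 0:
                self._cursor -= 1
                del self._chars[self._cursor]
        elif ch == self.LINE_START:
            self._cursor = 0
        else:
            self._chars.insert(self._cursor, ch)
            self._cursor += 1
        return None
```

Only lines that come back as (True, line) would be written to the serial port; the proxy has to echo keystrokes to the client itself, since nothing reaches the device until Enter.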
If you must do it character by character, .read(1) on the (unbuffered) input will allow you to, but the vetting becomes potentially more problematic; again you can keep an in-memory image of the edited command that you've sent so far (as you apply the edit requests even while sending them), but what happens when the "enter" arrives and your vetting shows you that the command thus composed must NOT be allowed -- can you e.g. send a number of "delete"s to the device to wipe away said command? Or is there a single "toss the complete line" edit request that would serve?
If you must send every character as you receive it (not allowed to accumulate them until decision point) AND there is no way to delete/erase characters already sent, then the task appears to be impossible (though I don't understand the "can't wait for the enter" condition AT ALL, so maybe there's hope).
After thinking about this for a while, it doesn't seem like there's any practical, reliable method to filter on client input. I'm going to attempt this from another angle: if I can identify persistent patterns in warning messages coming from the serial devices (e.g. confirmation prompts) I may be able to abort reliably. Thanks anyway for the input!
Fabric is doing a similar thing.
For an SSH API you should check out paramiko.